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This  thesis  develops  the  application  of  kriging  for  the  controlled  minimization 
of  large  data  sets.  As  a  result  of  this  research,  subsets  of  the  original  huge  data  set 
can  be  selected  such  that  the  largest  estimation  variance  of  the  reconstructed  image 
is  within  an  acceptable  level  of  error  as  defined  by  the  user.  The  benefits  of  this  effort 
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that  contained  in  the  original  data. 
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Abstract 

Frequently,  the  quantity  of  data  available  is  much  greater  than  that  which  can 
be  manipulated  in  an  efficient  and  timely  manner.  This  can  cause  several  problems. 
The  first,  and  probably  most  critical,  problem  is  the  excessive  on-line  storage  needs 
of  these  huge  data  sets.  Secondly,  in  'the  computer  animation  field,  huge  data  sets 
may  require  excessive  computational  time  for  generation  of  each  frame  of  a  computer 
animation.  Thirdly,  computer  screens  have  a  limited  resolution  and  need  too  much 
computational  time  removing  excessive  detail  from  images  generated  with  a  higher 
resolution  than  can  be  displayed.  Lastly^  too  much  time  is  required  to  transmit  huge 
amounts  of  data  from  location  to  location.  What  is  needed  is  a  method  of  minimizing 
the  data  set  based  on  some  acceptable  level  of  resolution. 

This  thesis  develops  the  application  of  kriging  for  the  controlled  minimization 
of  large  data  sets  based  on  a  maximum  acceptable  level  of  error.  Specifically,  the 
geostatistical  estimation  technique  of  kriging  is  used  to  produce  minimal  data  sets 
and  to  estimate  the  unknown  values  on  an  arbitrarily  sized  grid  using  as  input  any 
data  set.  All  the  procedures  necessary  to  improve  the  accuracy  of  the  estimate  as  well 
as  the  kriging  procedure  are  developed.  .Using  these  procedures  the  entire  process 
can  be  automated.  The  techniques  are  demonstrated  using  Magnetic  Resonance 
Image  data  to  support  minimizing  on-line  storage  requirements.  Two  concurrent 
thesis  efforts  also  use  the  techniques  to  enlarge  satellite  photographs  and  to  change 
the  grid  resolution  of  terrain  data. 


THE  APPLICATION  OF  KRIGING  FOR  CONTROLLED 
MINIMIZATION  OF  LARGE  DATA  SETS 


I.  Introduction 

This  research  effort  develops  the  application  of  kriging  to  the  controlled  mini¬ 
mization  of  large  d'^ta  sets.  Kriging  is  a  geostatistical  estimation  procedure  named 
after  D.G.  Krige,  a  South  African  mining  engineer.  Although  kriging  has  its  origins 
in  geostatistics,  the  methods  are  applicable  to  a  wide  range  of  disciplines.  This  first 
chapter  provides  the  background,  the  research  objectives,  and  the  scope  of  the  study. 

1.1  Background 

Historically,  man  has  had  more  data  than  he  can  manipulate  in  an  efficient  and 
timely  manner.  The  quantity  of  data  that  can  be  manipulated  has  grown  dramati¬ 
cally,  but  the  quantity  available  has  grown  even  feister.  What  is  needed  is  a  method 
of  choosing  a  minimal  data  set  based  on  the  desired  level  of  acceptable  error  in  the 
data  needed  for  the  application.  Using  a  subset  of  the  data  will  then  allow  faster 
manipulation  of  that  data  while  maintaining  the  required  level  of  acceptable  error 
in  the  data. 

The  three  areeis  that  will  probably  benefit  most  from  the  use  of  kriging  are 
computer  graphics,  computer  animation,  and  topographical  estimation.  These  were 
the  main  focus  of  this  kriging  development  effort.  All  three  of  these  areas  require 
the  ability  choose  a  subset  of  the  original  larger  data  set  and  estimate  the  unknown 
points  based  on  that  minimal  data  set.  For  enlargement  of  the  original  data  set  the 
subset  is  equal  to  the  original  data  set. 

In  computer  graphics,  an  image  may  be  displayed  on  the  entire  screen  or  it 
may  displayed  on  only  a  very  small  portion  of  the  screen.  Using  the  entire  data  set 


1-1 


in  both  cases  may  waste  substantial  computer  time  calculating  points  that  can’t  be 
displayed  on  the  small  portion  of  the  screen.  Using  a  subset  of  the  entire  data  set 
in  both  cases  may  discard  substantial  detail  needed  for  the  entire  screen  display  of 
the  image.  This  problem  can  be  alleviated  by  having  multiple  subsets  of  the  original 
data  set,  one  for  each  range  of  resolutions  displayed.  A  range  of  resolutions  would 
be  used  to  minimize  the  number  of  subsets  needed. 

In  computer  animation  data  sets  containing  large  numbers  of  graphics  primi¬ 
tives  (points,  lines,  polygons,  patches)  require  great  amounts  of  time  for  generation 
of  each  frame  of  the  animation.  In  this  case  what  is  needed  is  a  minimal  data  set 
based  on  the  desired  resolution  of  the  animated  object.  The  smaller  the  animated 
object  is  relative  to  the  overall  image  size  the  fewer  graphics  primitives  needed  in 
the  minimal  data  set.  Faster  moving  objects  also  require  fewer  graphics  primitives 
in  the  minimal  data  set.  Therefore,  in  selecting  a  minimal  data  set,  smaller  or  faster 
moving  animated  objects  require  fewer  graphics  primitives  in  the  minimal  data  set 
than  larger  or  slower  moving  objects. 

Part  of  this  approach  involves  estimating  points  that  aren’t  known  because 
they  were  removed  while  producing  a  minimal  data  set  or  were  not  known  in  the 
first  place.  This  aspect  of  the  method  can  then  be  used  for  topographical  estimation. 
A  photograph  can  be  enlarged  by  estimating  points  between  known  points  of  the 
original  image.  This  does  not  add  detail  but  can  aid  a  photo-interpreter  in  discerning 
more  information  from  the  enlarged  image.  This  aspect  can  also  be  used  to  generate 
biofidelic  head  forms  to  assist  in  the  design  of  environmental  protection  equipment. 
This  involves  estimating  a  general  head  shape,  using  many  head  shape  data  sets, 
thereby  minimizing  the  variability  between  the  different  head  shapes. 

All  these  uses  minimize  the  amount  of  data  that  must  be  readily  available  on 
immediat  e  access  storage  devices.  The  large  original  data  sets  can  be  stored  on  slow 
access,  long  term  storage  devices. 
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l.S  Research  Objective 

The  purpose  of  this  research  effort  was  to  develop  a  procedure  that  could  select 
a  minimal  data  set  in  a  statistically  controlled  fashion  and  estimate  unknown  values, 
at  regular  grid  locations,  based  on  a  desired  overall  maximum  acceptable  level  of 
error. 

1.3  Scope 

This  thesis  develops  and  demonstrates  the  application  of  kriging  in  the  con¬ 
trolled  minimization  of  large  data  sets.  Specifically,  this  study  includes  a  discussion 
of  the  theoretical  development  of  kriging  and  the  computer  programs  necessary  for 
the  application  of  the  technique.  The  following  provides  a  summary  of  the  extent  of 
this  research  effort. 

1.3.1  Theory  of  Kriging  The  literature  review  provides  an  introduction  to 
the  theory  and  the  development  of  kriging  from  the  field  of  geostatistics.  Emphasis 
is  placed  on  the  kriging  equations  and  structural  analysis  of  the  data. 

1.3.2  Procedural  Development  A  complete  development  of  the  kriging  proce¬ 
dure  is  outlined  in  Chapter  III. 

1.3.3  Kriging  Programs  This  document  includes  a  complete  package  of  the 
programs  required  for  the  kriging  of  data.  Specifically,  the  following  programs,  all 
written  in  C-f-l-,  are  included: 

Zonal  Trend  Partitioning  Program.  This  partitions  the  data  based  on  varia¬ 
tions  of  the  data  from  the  row  and  column  median  values  of  the  data.  This  program 
isolates  the  zonal  anisotropic  behavior  of  the  data  into  data  subsets. 

Global  Trend  Removal  Program.  This  determines  the  least  squares  coefficients 
for  a  polynomial  surface  through  the  points  and  outputs  the  residuals. 
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Least  Squares  Semivariogram  Estimation  Program.  This  determines  the  least 
squares  parameters  for  three  of  the  more  commonly  used  semivariogram  models:  the 
linear,  the  De  Wijsian,  and  the  spherical  models. 

Kriging  Program.  This  program  selects  the  minimal  data  set  and/or  estimates 
the  values  at  the  grid  locations  and  provides  the  kriging  variances  for  these  points. 

Global  Trend  Addition  Program.  This  uses  the  coefficients  for  a  polynomial  sur¬ 
face  through  the  points  to  add  the  global  trend  to  the  kriged  residuals  and  outputs 
the  estimates. 
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IL  Literature  Review 


The  first  step  is  to  research  the  current  methods  pertaining  to  the  application 
of  kriging  and  the  structural  analysis  needed  for  kriging.  Therefore,  this  chapter 
provides  a  review  of  the  literature  pertaining  to  the  areas  of  kriging  and  structural 
analysis.  The  emphaisis  of  this  review  is  on  the  development  of  kriging  in  the  field 
of  geostatistics  and  the  structural  analysis  essential  to  the  application  of  the  kriging 
procedures. 

2.1  Structural  Analysis 

The  goal  of  structural  analysis,  as  it  pertains  to  using  the  kriging  method,  is 
to  determine  the  spatial  distribution  of  the  quantity  of  interest,  the  semivariogram. 
Before  characterizing  the  spatial  distribution,  however,  the  presence  of  global  trend 
must  be  determined  and,  if  present,  removed  from  the  data.  In  kriging,  the  spatial 
distribution  of  the  quantity  of  interest  is  characterized  by  the  semivariogram.  The 
semivariogram  is  a  function  describing  the  expected  difference  in  value  between  pairs 
of  samples  with  a  given  spatial  relationship  (4:11). 

Yakowitz  and  Szidarovszky  note: 


The  kriging  method  is  composed  of  two  activities,  (i)  inferring  the 
semivariogram  from  the  data  and  (ii)  assuming  that  the  inferred  semi¬ 
variogram  is  indeed  exact,  providing  a  best  linear  unbiased  estimator  and 
associated  error  variance.  (20:23-24) 


Journel  and  Huijbregts  emphasize  that  the  first  and  most  important  step  in  any 
geostatistical  study  is  structural  analysis  (12:12).  “Structural  analysis  is  the  name 
given  to  the  procedure  of  characterizing  the  structures  of  the  spatial  distribution  of 
the  variables  considered  (e.g.,  grades,  thicknesses,  accumulations)”  (12:12). 
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2.1.1  Trend  Remrval  The  goal  of  computing  and  using  the  semivariogram 
is  to  improve  the  accuracy  of  the  estimate  of  the  desired  quantity  at  the  desired 
location.  To  achieve  this  goal,  it  is  necessary  to  remove  any  trend  in  the  data.  In 
this  context  trend  is  the  tendency  for  the  local  mean  to  increase,  or  decrease,  as  a 
function  of  spatial  location  in  the  data.  One  method  to  remove  the  trend  is  to  fit  a 
polynomial  to  the  data  and  then  subtract  the  value  of  the  polynomial  at  each  point 
from  the  known  value  at  that  point.  This  method  provides  a  continuous  trend  value 
for  any  point  and,  for  this  research  effort,  is  preferre'^  over  the  discrete  method  of 
median  polishing  as  suggested  by  Cressie  (5).  Ihe  problem  with  discrete  methods 
is  that  the  trend  is  only  known  at  the  original  data  sites.  Therefore,  the  trend  can 
only  be  estimated  at  the  desired  location  and  added  to  the  estimate  of  the  residual 
for  that  location. 

2.1.2  Cluster  Analysis  Another  method  to  improve  the  accuracy  of  the  esti¬ 
mate  is  to  perform  cluster  analysis  on  the  data  before  calculating  the  semivariogram. 
Cluster  analysis  refers  to  a  number  of  techniques  which  classify  objects  in  homoge¬ 
neous  and  distinct  groups  (1).  The  definition  of  a  cluster  is  often  determined  by  the 
researcher.  The  goal,  however,  is  to  partition  the  original  data  set  into  subsets  that 
contain  some  degree  of  similarity.  The  method  chosen  is  to  calculate  the  row  and 
column  sums,  find  the  medians  of  the  row  and  column  sums,  and  then  partition  the 
data  based  on  row  and  column  sums  that  fall  below,  or  above,  the  corresponding 
row  or  column  median. 

2.1.3  The  Variogram  The  variogram  is  twice  the  semivariogram  and  accord¬ 
ing  to  Omre,  “the  variogram  function  is  the  backbone  of  geostatistical  analysis” 
(15:107).  The  semivariogram  function  is  defined  as  the  variance  of  the  difference 
of  the  quantity  of  interest  between  two  points,  [Z(xi,yi)  -  Z{x2,y2)].  Using  this 
definition,  it  can  be  seen  that  the  semivariogram  is  zero  between  a  point  and  itself. 
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The  variogram  function  is  written  as: 

27{/ii2)  =  Var{Z{xx,yi)  -  Z{x2,y2)} 

Where  h\2  is  the  distance  between  the  points  (xi,  yx)  and  (12,  yi)-  The  semivariogram 
is  simply  7(^12)- 

In  practice,  only  an  estimator  of  the  theoretical  semivariogram  is  available. 
This  estimator  is  known  as  the  experimental  semivariogram  and  is  calculated  as 
follows: 

1  |Ar| 

where  \N\  is  the  number  of  pairs  of  data  values  at  a  distance  of  h  apart  from  one 
another,  x,-  is  the  location  of  point  i,  x,-  +  A  is  the  location  of  a  point  at  distance  h 
from  t,  and  z(x,)  and  z{x,  +  h)  are  the  values  of  the  quantity  of  interest  at  i  and 
Xi  +  h.  An  example  of  an  experimental  semivariogram  is  shown  in  Figure  2.1. 

The  next  step  in  the  structural  analysis  procedure  is  to  fit  a  model  to  the 
experimental  semivariogram  so  that  semivariogram  values  may  be  calculated  for  the 
points  to  be  estimated.  The  approach  taken  is  to  calculate  the  parameters  for  each 
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model  using  a  numerical  least-squares  fitting  routine  and  then  select  the  model  with 
the  best  simple  correlation.  Since  the  number  of  pairs  of  points  h  distance  apart  used 
in  calculating  the  semivariogram  decreases  as  h  increases  Cressie  proposes  minimizing 
a  weighted  sum  of  squares  and  indicates  that  work  by  Zimmerman  and  Zimmerman 
shows  that  the  weighted  least  squares  approach  never  performs  poorly  and  usually 
does  well  (5:198).  Clark  suggests  limiting  h  to  half  the  largest  distance  in  the  data 
set  (4:14). 

2.1.4  Standard  Mndels  Three  of  the  more  common  models  are  the  spherical 
model,  tiie  linear  model,  and  the  De  Wijsian  model  (6:120- 122).  A  brief  introduction 
to  each  <..f  these  models  is  provided. 


Spherical  Model  (6:80)  The  spherical  model  is  the  most  common 
model  and  is  defined  by  three  parameters:  a,  C,  and  Co.  The  first  parameter, 
a,  is  called  the  range  and  is  used  to  determine  the  range  of  influence.  The  third 
parameter,  Co,  is  known  as  the  nugget  effect.  Finally,  the  second  parameter,  C,  is 
used  in  conjunction  with  Co  to  determine  the  sill,  (C-t-Co),  defined  as  the  covariance 
of  the  samples  at  A=0.0,  cr(0).  The  form  of  the  spherical  model  is  as  follows: 


C  -f  Co 
0 


if  ^  >  a 
if  =  0 


The  shape  of  this  model  is  shown  in  Figure  2.2. 


Linear  Model  The  equation  for  the  linear  model  is  of  the  form 
7(A)  =  ah +  b.  This  is  one  of  two  models  used  in  practice  which  does  not  have  a  sill 
(6:120).  The  shape  of  this  model  is  shown  in  Figure  2.3. 
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7(/i) 


h 

Figure  2.4.  De  Wijsian  Semivariogram 

Dc  Wijsian  Model  The  form  for  the  De  Wijsian  model  is  7(A)  = 
a  \n{h)+b.  However,  ‘one  usually  writes  a  =  3a  and  calls  it  the  coefficient  of  intrinsic 
dispersion”  (6:121).  This  model  also  does  not  have  a  sill.  This  model  is  named  after 
Prof.  H.J.  de  Wijs  and  is  used  when  the  experimental  data  plots  as  a  straight  line 
on  a  logarithmic  scale  (6:120).  The  shape  of  this  model  is  shown  in  Figure  2.4. 

8.1.5  Problems  with  Anisotropy  Anisotropies  are  typically  classified  in  one 
of  two  categories:  geometric  and  zonal  (6:134).  Geometric  anisotropy  refers  to  the 
situation  where  the  value  or  expected  variation  varies  more  quickly  in  one  direction 
than  in  another.  This  is  evidenced  by  different  semivariogram  ranges  in  different 
directions  but,  for  the  spherical  model,  identical  sills.  This  type  of  anisotropy  can 
be  handled  by  scaling  the  coordinates  of  the  data  sets  or  by  using  different  semivari- 
ograms  for  different  directions.  The  method  chosen  is  to  scale  the  coordinates  of  t'^e 
data  sets.  Zonal  isotropy  is  characterized  by  qualitative  variations  or  separations  ^ 
the  data  into  zones.  This  form  is  very  difficult  to  treat.  In  the  spherical  rn<  ■’el  this 
is  evidenced  by  different  sills  in  different  directions. 

The  anisotropy  ratio  (or  affinity  modulus),  k,  is  equal  to  the  ratio  of  the  ranges 
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in  each  direction,  k  =  a{y)/a{x).  For  example,  if  the  range  is  50  feet  in  the  x  direction 
and  300  feet  in  the  y  direction,  then  k  is  equal  to  300/50  (6:134-135).  The  distance 
vector,  h,  can  be  decomposed  into  two  components,  hi  =  (xi  —  X2)  along  the  x  axis, 
and  h2  =  {yi  —  2/2)  along  the  y  axis.  Therefore,  using  the  semivariogram  model 
calculated  in  the  y  direction,  the  scaled  distance  between  the  two  points  (xi,yi)  and 
{X2.y2)  is  h  =  ^  (ii  —  X2y  +  {yi  —  ^2)^-  This  treatment  of  geometric  anisotropy 
can  be  extended  to  three  dimensions  by  introducing  a  second  anisotropy  ratio  for 
the  third  direction,  c  =  a{y)fa{z),  and  multiplying  it  by  the  change  in  distance  in 
that  direction.  Treating  zonal  anisotropy  is  beyond  the  scope  of  this  study;  for  more 
information  reference  David  (6:134-148). 

S.S  Kriging 

A  review  of  the  literature  showed  some  applications  of  kriging  in  the  controlled 
minimization  of  large  data  sets.  In  particular,  Ferenc  Szidarovszky  explains  how  a 
minimal  data  set  can  be  constructed  by  testing  all  possible  subsets  of  the  data  and 
using  a  branch  and  bound  technique  to  reduce  the  time  required  for  this  testing 
(18).  This  review  explains  the  theory  of  the  technique  eis  developed  in  the  field 
of  geostatistics  and  applied  to  this  effort.  Specifically,  the  following  kriging  topics 
are  discussed:  the  origin,  a  definition,  the  fundamental  equations,  the  universal 
equations,  the  assumptions,  and  several  types  of  kriging. 

2.2.1  Origin  of  Kriging  The  method  of  kriging  traces  its  origins  to  the  field  of 
geostatistics  as  developed  in  the  mining  industry.  According  to  Journel:  “in  mining 
practice,  one  problem  is  to  find  the  best  possible  estimator  of  the  mean  grade  of  a 
block”  (11:563).  He  further  states  that  D.G.  Krige  proposed  a  regression  technique 
for  this  problem  in  1951  and  that  “in  1963,  Matheron  formalized  and  generalized  this 
regression  procedure  and  gave  it  the  name  of  kriging”  (11:563)  after  D.G.  Krige,  a 
mining  engineer  in  South  Africa.  Georges  Matheron  is  also  credited  with  introducing 
the  concept  of  regionalized  variables.  According  to  Matheron: 


Geostatistics,  in  their  most  general  acceptation,  are  concerned  with 
the  study  of  the  distribution  in  space  of  useful  values  for  mining  engi¬ 
neers  and  geologists,  such  as  grade,  thickness,  or  accumulation,  includ¬ 
ing  a  most  important  practical  application  of  the  problems  arising  in 
ore-deposit  evaluation.  (13:224) 

2.2.2  Definition  Matheron  originally  defined  kriging  as  follows:  “kriging  is 
the  probabilistic  process  of  obtaining  the  best  linear  unbiased  estimator  of  an  un¬ 
known  variable”  (11:563).  In  this  context,  “best”  is  defined  as  “having  the  smallest 
estimation  variance”  (4:104).  Matheron  later  generalized  the  techniques  for  obtain¬ 
ing  nonlinear  unbiased  estimates.  Journel  states  that  kriging  should  be  redefined 
as  “a  probabilistic  theory  of  estimation  bitsed  on  the  principle  of  minimization  of 
the  estimation  variance”  (11:363).  Therefore,  kriging  is  a  method  of  estimating  an 
unknown  value  at  a  point  based  on  known  values  of  surrounding  points  with  the 
constraint  that  the  estimation  error  is  minimized. 

2.2.3  Kriging  Equations  The  estimate  for  an  unknown  value  at  a  point  is  the 
weighted  average  of  surrounding  values  with  the  closer  points  having  more  weight 
than  points  further  away.  Specifically,  the  equation  for  the  estimator  is: 


Xq  =  WiXi  -|-  W2X2  +  W3X3  . . .  -f-  WnX„ 


where  Xq  is  the  estimate,  lUi,  W2-,  W3,  ...,  w„  are  the  weights,  and  Xi,  X2,  X3,  . . ., 
X„  are  the  sample  values  (4:99). 

For  the  estimator  to  be  unbiased  the  weights  must  sum  to  one  and  there  must  be 
no  trend.  An  unbiased  estimator  is  one  in  which,  over  a  large  number  of  estimations, 
the  average  error  is  zero.  Trend  is  the  tendency  for  the  local  mean  to  increase,  or 
decrease,  as  a  function  of  spatial  location  in  the  data.  To  determine  the  weights  that 
minimize  the  estimation  variance,  the  estimation  variance  must  be  defined.  The 
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estimation  variance  of  Xo  for  the  general  unbiased  linear  estimator  is: 

Var{Xo  -X)  =  'f^  wa{hip)  -  ^  WiWj^{hij) 

t=l  isl  j=l 

where  Xo  —  X  is  the  estimation  error,  Var(Xo  —  X)  is  the  variance  of  this  error, 
Wi  and  Wj  are  the  weights,  7(A«p)  is  the  semivariogram  value  between  the  value 
being  estimated  and  the  known  value  at  point  i,  and  is  the  semivariogram 

value  between  the  known  value  at  point  i  and  the  known  value  at  point  j.  The 
semivariogram  was  discussed  in  the  structural  analysis  section  of  this  review.  For 
any  given  set  of  observations  the  variance  is  a  function  only  of  the  values  of  the 
weights.  Therefore,  to  minimize  the  estimation  variance,  the  partial  derivatives  of 
the  estimation  variance  with  respect  to  the  weights  must  be  set  to  zero  and  the 
weights  must  be  determined  by  solving  the  resulting  system  of  equations.  The  result 
is  the  following  system  of  linear  equations  (referred  to  as  the  kriging  system): 

0  +  Wi  +  W2  +  W3  +  . . .  +  Wn  =  1 

A  +  u;i7(/in)  +  1^27(^12)  +  +  •••  +  tt’n7(Ain)  =  7(Aip) 

A  +  Wi'r{h2l)  +  W2'){h22)  +  X03lf{h23)  +  •••  +  U'V7(A2n)  =  7(^2?) 

A  +  lUl7(^3l)  +  ^27(^32)  +  W3'y{h33)  +  +  Wn'y{h3n)  =  7(^3?) 

:+  :  +  :  +  :  +  :+  :  =  : 

A  +  Wl7(ftnl)  +  W2l[{hn2)  +  W37(An3)  +  ...  +  «'’n7(^  nn)  —  yi^np) 

As  stated  previously,  to  maintain  the  unbiased  nature  of  the  estimate,  the  weights 

must  sum  to  one.  A  Lagrangian  Multiplier  (A)  is  used  so  that  the  number  of  un¬ 

knowns  and  the  number  of  equations  are  equal.  Using  the  calculated  weights  the 
minimized  estimation  variance  of  the  estimate,  Xq,  is: 

Var{Xo-X)  =  j2wnihip)  +  X 

t=i 

It  is  important  to  notice  that  the  weights  and  the  Lagrangian  Multiplier  depend 
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only  on  the  semivariogram  and  therefore  the  estimation  variance  depends  only  on 
the  semivariogram.  As  shown  in  the  structural  analysis  section,  the  semivariogram  is 
a  function  of  the  relative  distance  and  orientation  of  the  known  points,  the  pattern. 
Therefore,  the  estimation  variance  depends  only  on  the  pattern  of  the  points,  partic¬ 
ularly  the  distances  between  the  known  points  and  the  distances  between  the  point 
to  be  estimated  and  the  known  points,  and  not  the  values  at  these  known  points. 
This  fact  will  be  very  useful  later  in  the  procedure  development. 

Following  a  similar  development  using  the  covariance  function,  which  is  a  func¬ 
tion  of  the  semivariogram  as  follows:  c-{A,j)  =  a(0)  —  an  equivalent  form  of 

the  kriging  equations  can  be  derived  as  follows: 

0  -f  Wi  +  Wi  -b  W3  -f-  . . .  +  Wn  =  1 

X  +  Wia(hn)  +  «>2<J‘(Ai2)  +  +  •••  +  =  <r(Aip) 

A  -f  Wia{h2i)  +  W2<T{h22)  +  +  •••  +  ttf„<7(A2n)  = 

A  -I-  W\(T(hz\)  +  W2<T{hyi)  -f-  VOz(f{hyi)  +  •••  +  t/’n<^(^3n)  = 

:  +  :  +  :  +  :  -f  :  +  :  s=  : 

A  -I-  Wiar{knl)  +  W2(r{hn2)  +  W3<T(hrCi)  +  ...  +  U^n<^(Ann)  =  <^(^np) 

Both  forms  of  the  equations  can  be  put  into  the  matrix  form  [A]  {X}  =  {B},  where 
[A]  is  the  square  matrix  of  the  semivariogram  (or  covariance)  values  between  the 
known  points,  {X}  is  the  column  vector  of  weights,  and  {B}  is  the  column  vector 
of  the  semivariogram  (or  covariance)  values  between  the  point  being  estimated  and 
the  known  points.  In  matrix  form,  the  solution  is  found  by  inverting  [A]  and  post- 
multiplying  by  {X}.  As  shown  in  the  structural  analysis  section,  the  value  of  the 
semivariogram  between  the  point  and  itself  is  zero  but  the  covariance  is  not.  What 
this  means  is  that  [A],  in  the  semivariogram  form,  has  a  zero  for  all  the  diagonal 
terms  but  the  covariance  form  does  not.  This  makes  the  covariance  form  more  stable 
during  matrix  inversion  than  the  semivariogram  form.  For  this  reason  the  covariance 
form  of  the  equations  is  preferred.  However,  since  the  covariance  docs  not  exist  for  ail 
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theoretical  semivariogram  models,  both  forms  of  the  equations  are  developed  in  this 
thesis.  The  minimized  estimation  variance  of  the  estimate,  Xo,  for  the  covariance 
form  is: 

Var{Xo  -  X)  =  <r(0)  -  Wia{hip)  -  A 

»=i 

2.2.4  Kriging  Assumptions  The  underlying  assumption  of  kriging  is  that  the 
values  fall  within  some  probability  distribution,  usually  a  normal  or  lognormal  dis¬ 
tribution.  This  assumption  allows  statistical  methods  to  be  applied  to  the  data. 
Kriging  usually  assumes  some  form  of  stationarity.  The  most  stringent  form  is  strong 
stationarity  in  which  aJl  higher  order  moments  exist  and  are  constant  everywhere 
over  the  data  field.  The  second  form  is  weak  stationarity.  Weak  stationarity  im¬ 
plies  that  all  random  variables  have  the  same  mean,  variance  and  autocorrelation 
function.  This  assumption  is  based  on  two  conditions:  1)  the  expected  value  of  the 
regionalized  variable  is  the  same  all  over  the  field  of  interest;  and,  2)  the  spatial 
covariance  of  the  regionalized  variable  is  the  same  all  over  the  field  of  interest  (6:92). 
The  assumption  of  weak  stationarity  can  be  further  relaxed  by  requiring  only  that 
all  random  variables  in  a  subsection  of  the  data  field,  the  neighborhood,  have  the 
same  first  and  second  order  moments  after  removal  of  any  local  drift.  This  is  the 
assumption  used  for  universal  kriging. 

2.2.5  Types  of  Kriging  The  more  popular  types  of  kriging  are  point,  block, 
lognormal,  disjunctive,  universal  kriging,  and  cokriging.  “The  kriging  techniques 
are  all  related,  and  are  refined  versions  of  the  weighted  moving  average  techniques 
used  by  Krige”  (10:25).  Point  and  universal  kriging  are  central  to  this  effort  and  are 
discussed  below  in  more  detail. 

Point  Kriging.  The  system  of  kriging  equations  previously  devel¬ 
oped  were  those  specific  to  point  kriging.  Davis  discusses  this  simplest  form  of 
kriging  and  provides  an  example  to  illustrate  the  mechanics  of  the  kriging  system. 
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Table  2.1.  Water  Table  Elevation  Data 


Location 

X  Coordinate 

Y  Coordinate 

Water  Table 
Elevation 

■Egn 

3.0 

4.0 

mggjgmm 

6.3 

3.4 

■SI 

■SSH 

2.0 

1.3 

Point  p 

3.0 

3.0 

■■■ 

The  following  example  uses  the  semivariogram  form  of  the  kriging  equations  and  is 
adapted  from  Statistics  and  Data  Analysis  in  Geology  and  demonstrates  the  use  of 
kriging  in  estimating  the  water  elevation  at  an  unsampled  location  (7:386-390). 

The  basic  problem  is  to  estimate  the  water  elevation  at  some  point  p  based  on 
the  elevations  at  three  other  points  in  the  general  /icinity.  The  coordinates  eind  the 
water  table  elevations  at  these  points  are  listed  in  Table  ..1.  A  structural  analysis 
determined  the  semivariogram  for  the  neighborhood  of  20  km  to  be  linear  with  a 
slope  of  4.0  m*/ km. 

After  solving  the  kriging  equations  to  determine  the  weights,  the  estimate  of 
the  water  elevation  at  point  p  can  be  calculated.  The  kriging  equations  used  to 
determine  the  weights  are: 


0.0  1.0  1.0  1.0 

f  ^ 

A 

1.0 

1.0  7(/iu)  niihu)  7(^13) 

.  4 

Wi 

►  =  < 

7(Aip) 

1-0  7(^21)  7(^22)  7(^23) 

W2 

7(/l2p) 

1.0  7(/i3i)  7(^32)  7(^33)  . 

W3 

,  7(/»3p)  , 

Using  the  distance  between  the  points,  h,  and  the  equation  for  the  semivari¬ 
ogram,  'y{hij)  =  4.0  *  h,  the  above  equations  are  rewritten  as: 
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0.0 

1.0 

1.0 

1.0 

•  ' 

A 

f  \ 

1.0 

1.0 

0.0 

13.4 

11.5 

•  < 

IWl 

.  =  < 

4.0 

1.0 

13.4 

0.0 

19.1 

W2 

13.3 

1.0 

11.5 

19.1 

0.0 

.  “^3  . 

7.9 

k  / 

Solving  these  equations  produces  the  following  estimates  for  the  weights; 


A 

-0.7267 

Wi 

0.6039 

•  =  i 

W2 

0.0868 

W3 

> 

0.3093 

k  4 

The  elevation  at  p  is  determined  as: 

Xq  =  0.6039  •  120.0  +  0.0868  •  103.0  +  0.3093  •  142.0  =  125.3  meters 
and  the  estimation  variance  is  determined  as: 

Var{Xo  -X)  =  0.6039  •  4.0  +  0.0868  •  13.3  +  0.3093  •  7.9  -  0.7267  =  5.3  m^/km 

Universal  Kriging.  Many  data  sets  are  not  stationary.  There  are 
two  main  causes  of  nonstationarity,  global  trend  and  local  drift  (6:238).  As  discussed 
in  the  structural  analysis  section,  global  trend  applies  to  the  entire  data  set  and  may 
be  removed  before  using  kriging.  Local  drift  is  trend  in  the  neighborhood  of  the 
point  being  estimated.  Therefore,  a  method  of  compensating  for  the  local  drift  is 
needed.  Universal  kriging  is  used  when  local  drift  may  be  present.  A  nonstationary 
regionalized  variable  is  composed  of  drift  and  the  residual  (7:393).  The  drift  is  the 
expected  value  of  the  variable  in  a  neighborhood  and  the  residual  is  the  difference 
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between  the  drift  and  the  actual  value.  In  this  form  of  kriging,  the  drift  is  removed 
from  the  regionalized  variable  by  incorporating  equations  for  the  drift  directly  in  the 
kriging  equations,  and  the  stationary  residuals  are  kriged.  In  short, 


Universal  kriging  can  thus  be  regarded  as  consisting  of  three  opera¬ 
tions:  First,  the  drift  must  be  estimated  and  removed.  Then,  the  sta¬ 
tionary  residuals  are  kriged  to  obtain  needed  estimates.  Finally,  the 
estimated  residuals  are  combined  with  the  drift  to  obtain  estimates  of 
the  actual  surface.  (7:393) 


The  drift  is  generally  represented  by  a  first  or  second-order  polynomial.  Davis 
(7:394-395)  provides  the  matrix  form  of  the  universal  kriging  system  when  the  first- 
order-polynomial  drift  at  a  point  p  is  defined  as: 


Mp  ~  +  CC'iX2i 


In  this  equation,  the  a’s  are  drift  coefficients  which  must  be  estimated  and  Xu  and 
X2i  are  the  coordinates  of  the  i**  control  point. 

The  equations,  in  matrix  form,  are  as  follows: 


0 

0 

0 

1 

1 

1 

A 

■  ' 

1 

0 

0 

0 

Ol 

X, 

0 

0 

0 

Y2  ••• 

K 

OC2 

Yp 

1 

7(^11) 

7(M  ••• 

7(^ln) 

•  < 

Wi 

.  =  . 

7(/iip) 

1 

Y2 

7(^21) 

7(/j22)  ••• 

l{h2n) 

W2 

liK) 

1 

Xn 

K 

7(^nl) 

7(^n2)  ••• 

'^{hnn) 

Wn 

\  J 

2-14 


The  corresponding  minimized  estimation  variance  of  the  estimate,  Xq,  is: 


VariXo  -X)  =  '£^a{hip)  +  A  +  a^Xj,  +  azTp 

i=i 

where  A’i,Al2,.  . .  are  the  x  coordinates  of  the  known  points,  Yi,Y2,. . .  jVIi  are 
the  y  coordinates  of  the  known  points,  Xp  is  the  x  coordinate  of  the  point  being 
estimated,  and  Yp  is  the  y  coordinate  of  the  point  being  estimated.  The  covariance 
form  of  the  equations  and  estimation  variance  are: 
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Var{Xo  -X)  =  <7(0)  - 


•  n 
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U=i 


2.S  Data  Set  Minimization 

Data  set  minimization  requires  the  repeated  use  of  kriging  to  determine  the 
minimal  data  set.  The  following  procedure  is  used  to  determine  a  minimal  data  set 
(17); 

1)  Choose  an  initial  set  of  points, 

2)  Calculate  the  unknown  surface  point  values  and  estimation  error, 

3)  Find  the  location  and  value  of  the  largest  estimation  variance. 
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4)  If  the  largest  estimation  error  is  greater  than  the  maximum  acceptable  level 
of  estimation  error  set  by  the  user,  add  a  point  at  that  location  and  go  to  step  2, 
else  stop  the  procedure. 

In  general,  to  produce  an  optimum  minimal  data  set  requires  testing  all  possible 
combinations  of  the  data  and  choosing  the  one  with  the  fewest  points  that  meets  the 
maximum  variance  criteria.  This  would  require  great  amounts  of  computational  time 
even  with  a  branch  and  bound  algorithm.  This  method  is  described  by  Szidarovszky 
(18).  The  method  chosen  for  this  effort  is,  as  explained  previously,  to  add  a  point  at 
the  location  of  maximum  estimation  variance  if  that  variance  exceeds  that  required 
by  the  user.  This  will,  in  general,  produce  a  suboptimal  data  set,  but  the  difference 
between  this  and  the  optimal  set  will  not  be  significant  and  is  well  worth  the  trade-off 
in  computational  time  (17). 

S.4  Summary 

Kriging  and  structural  analysis  topics  from  the  literature  were  presented.  This 
review  presented  the  origin  of  kriging,  defined  kriging,  and  presented  the  kriging 
system  of  equations.  Several  of  the  more  commonly  used  models  for  the  theoretical 
semivariogram  were  discussed. 

With  the  understanding  of  the  kriging  process  and  data  set  minimization  pro¬ 
vided  by  this  literature  review,  the  objective  of  this  effort  can  be  restated  more  pre¬ 
cisely.  Therefore,  the  objective  of  this  effort  is  to  develop  a  procedure  that  produces 
a  minimal  data  set,  in  a  reeisonable  amount  of  time,  that  meets  the  user  provided 
maximum  acceptable  level  of  estimation  error  using  the  kriging  process.  The  pro¬ 
cedure  must  also  perform  surface  kriging  without  minimization.  The  minimal  data 
set  produced  is  allowed  to  not  be  the  optimal  data  set  due  to  computational  time 
limitations. 
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III.  Methodology 


This  chapter  presents  the  methodology  used  in  completing  the  objective  out¬ 
lined  in  Chapter  1.  Kriging  involves  both  the  structural  analysis  of  the  data  and 
the  determination  of  the  estimates  and  error  variances.  These  two  activities  were 
treated  as  separate  tasks.  The  first  task  is  discussed  below  under  Structural  Analysis 
and  the  second  task  is  considered  under  the  heading  of  Kriging.  The  techniques  for 
minimization  are  discussed  under  the  heading  of  Minimization.  The  fourth  section 
documents  the  application  of  the  kriging  procedures. 

Supporting  Tasks.  In  support  of  the  kriging  procedures  and  the 
need  for  a  common  interface  that  isolates  the  input  data  format  from  the  kriging 
procedures,  two  packages,  written  in  C-f+,  are  provided.  The  first  is  a  matrix 
object  that  provides  for  creation  and  manipulation  of  matrices.  The  second  is  an 
interface  routine  package  that  isolates  each  procedure  from  the  data  format  and  the 
control  information  input.  These  packages  are  documented  in  Appendix  E,  The 
Programmers  Manual. 

3.1  Structural  Analysis 

Structural  analysis  is  key  to  the  efficient  and  optimal  implementation  of  kriging 
in  any  field.  This  analysis  must  partition  the  data  into  homogenous  groupings  if 
necessary,  ensure  the  global  stationarity  of  the  data  by  removing  any  global  trend, 
calculate  the  experimental  semivariogram,  and  estimate  the  parameters  of  the  three 
commonly  used  theoretical  semivariogram  models. 

3.1.1  Trend  Analysis  A  value  for  the  trend  is  needed  at  arbitrary  locations. 
A  method  that  provides  exact  values  at  those  arbitrary  locations  is  preferred  over  a 
method  that  requires  estimating  the  values  at  those  locations  (17).  Therefore,  the 
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method  of  choice  for  removing  global  trend  is  to  fit  a  polynomial  of  the  form: 

A  +  Bx  +  Cy  +  Dx^  +  Ey"^  +  Fxy  =  z 

through  the  data.  A  least  squares  regression  technique  is  used  to  determine  the 
coefficients.  The  program  uses  the  common  interface  routines  to  get  control  infor¬ 
mation  from  the  control  file  and  to  read  in  and  write  out  the  data.  The  trend  may 
be  removed  from  the  data  before  or  after  it  is  partitioned.  This  program  v;as  written 
in  C-|-f  by  Wayne  McGee  (14). 

S.l.S  Cluster  Analysis  The  following  are  the  steps  involved  in  partitioning 
the  data  into  homogenous  groupings.  As  the  dafa  is  read  in  the  row  and  column 
sums  are  calculated  and  the  row  and  column  sum  medians  are  then  estimated.  The 
next  step  is  to  identify  the  rows  and  columns  whose  sum  is  above  or  below  the 
corresponding  median.  The  last  step  is  to  write  the  partitioned  data  to  the  disk 
along  with  information  about  the  partitions.  The  data  set  may  be  partitioned  before 
or  after  trend  removal.  This  program  was  written  in  C-I-+  by  Donald  Duckett  (8). 

3.1.3  The  Semivariogram  There  are  two  activities  involved  with  determin¬ 
ing  the  semivariogram;  calculating  the  experimental  semivariogram  and  calculating 
the  parameters  of  the  theoretical  semivariogram  model.  These  two  activities  are 
addressed  separately. 

Experimental  Semivariogram.  The  data  may  not  benefit  from  par¬ 
titioning  and  may  not  have  a  global  trend.  In  that  case,  the  only  step  to  be  performed 
on  the  data  is  the  determination  of  the  experimental  semivariogram  and  fitting  a 
model  to  that  experimental  semivariogram  data. 

The  first  task  of  this  step  is  to  determine  the  pairs  of  points  that  are  a  multiple 
of  Lh  apart  in  the  directions  of  interest  to  the  user,  usually  0°  (positive  y  direction) 
and  90°  (positive  x  direction).  The  determination  of  the  experimental  semivariogram 
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in  each  direction  of  interest  is  performed  sequentially.  Typically,  data  are  known 
either  at  regularly  spaced  grid  points  or  at  irregularly  distributed  points  throughout 
the  region  of  study.  Data  is  assumed  to  be  available  in  the  form  Z{xi),  where  Z{x{) 
is  the  veilue  of  a  random  function  at  vector  location  x,-. 

For  points  that  are  irregularly  distributed  throughout  the  region  of  study,  the 
method  of  determining  the  pairs  of  points  is  more  involved  and  requires  more  compu¬ 
tation  time.  The  angle  and  distance  between  each  pair  of  points  must  be  calculated. 
Those  pairs  of  points  that  are  in  the  direction  of  interest,  q,  and  at  a  multiple  of  Ah 
apart  are  used  in  the  determination  of  the  experimental  semivariogram.  Since  points 
may  not  be  in  exactly  the  direction  a,  the  user  is  allowed  to  specify  a  semi-inclusion 
angle,  or  regularization  angle,  If  the  angle  between  a  pair  of  points,  Z{x{)  and 
Z{xj),  falls  within  the  range  a±<f)  then  that  point  is  considered  to  be  in  the  direction 
of  interest.  If  the  pair  of  points  are  to  be  used  in  the  calculation  of  the  experimental 
semivariogram  then  the  following  two  actions  are  performed:  1)  the  square  of  the 
difference  of  the  quantity  of  interest  of  that  pair  is  calculated,  {Z{xi)  -  Z{xj))^,  and 
added  to  an  array  based  on  the  distance  between  the  points;  and,  2)  the  location  in 
the  array  containing  the  number  of  pairs  of  points  integer  multiples  of  Ah  apart  is 
incremented  based  on  the  distance  between  the  points.  These  two  arrays  are  used 
later  in  the  calculation  of  the  experimental  semivariogram. 

The  calculation  of  the  experimental  semivariogram  is  simpler  when  the  data  are 
aligned  in  a  grid  structure  because  the  coordinates  of  each  data  point  are  a  multiple 
of  some  incremental  distance  Ah  apart  and  the  0°  and  90®  directions  are  easily 
determined.  The  determination  of  the  pairs  of  points  Ah  apart  in  the  0°  and  90° 
directions  reduces  to  scanning  the  data  in  the  y  and  x  directions  and  calculating  the 
square  of  the  difference  of  the  quantity  of  interest  of  that  pair.  The  two  summing 
arrays,  mentioned  previously,  are  updated  for  use  later  in  the  calculation  of  the 
experimental  semivariogram. 
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The  next  step  is  to  calculate  the  experimental  semivariogram,  described  in 
Chapter  II,  using  the  two  summing  arrays  and  the  formula; 

T  \N\ 

where  \N\  is  the  number  of  pairs  of  data  values  at  a  distance  of  h  apart  from  one 
another,  Xi  is  the  location  of  point  i,  Xj  +  h  is  the  location  of  a  point  at  distance  h 
from  i,  and  x(x,)  and  z{xi  +  h)  are  the  values  of  the  quantity  of  interest  at  i  and 
Xi  +  h. 

The  experimental  semivariogram  is  a  discrete  function  and  can  not  be  used 
at  arbitrary  locations  in  the  data  field.  The  theoretical  variogram,  however,  is  a 
continuous  function.  Therefore  the  last  step  is  to  calculate  the  required  parameters 
for  the  theoretical  semivariograun. 

Theoretical  Semivariogram.  Three  theoretical  semivariograms  are 
modeled  using  the  experimental  semivariogram  calculated  by  the  method  outlined 
in  the  previous  section.  The  three  models  are  the  linear,  De  Wijsian,  eind  spherical 
models  mentioned  previously.  The  parameters  of  the  theoretical  semivariograms 
are  estimated  using  a  weighted  least  squares  regression  technique.  The  program  to 
calculate  both  the  experimental  and  theoretical  semivariograms  is  written  in  C++ 
by  Dr.  David  Robinson  with  modifications  by  Donald  Duckett  and  Wayne  McGee 
(17). 


3.1.4  Data  Set  Reconstruction  After  calculating  an  estimate  of  the  surface, 
based  on  the  input  data  set,  any  removed  trend  must  be  added  and  any  partitioned 
data  must  be  reassembled. 

Trend  Addition.  The  global  trend  is  added  using  a  polynomial  of  the  form: 
Ai-  Bx  +  Cy-\-  Dx^  +  Ey^  +  Fxy  =  z 
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where  the  coeflScients  were  previously  calculated  by  the  trend  removal  program.  The 
program  uses  the  common  interface  routines  to  get  control  information,  including  the 
polynomial  coefficients,  from  the  control  file  and  to  read  in  and  write  out  the  data. 
The  trend  must  be  added  to  the  data  before  or  after  reassembly  of  the  partitions 
bcised  on  when  it  was  removed.  This  program  is  written  in  C++  by  Wayne  McGee 
(14). 

Partition  Assembly.  If  the  data  is  partitioned  it  must  then  be  manually  re¬ 
assembled.  A  program  to  do  this  was  not  written  due  to  lack  of  time. 

3.2  Kriging 

To  perform  the  tasks  required  by  this  thesis  effort  and  the  two  concurrent  thesis 
efforts,  a  kriging  program  originally  written  in  C  by  Michael  Grant  (9)  was  rewritten 
in  C++  with  extensive  modifications.  The  modifications  correct  some  programmatic 
errors  as  well  is  incorporate  geometric  anisotropy,  minimal  data  set  production,  and 
generalization  to  any  data  set  as  required  for  satisfaction  of  the  objective  outlined 
in  Chapter  I.  These  routines  are  documented  in  Appendix  E,  The  Programmers 
Manual.  Note  that  a  point  is  “kriged”  if  the  estimate  and  the  estimation  variance 
at  that  point  are  calculated. 

3.3  Minimization 

Minimization  is  the  process  of  selecting  a  subset  of  the  original  data  set  subject 
to  some  constraint  on  the  reconstruction  process.  For  this  effort  that  constraint  is 
the  maximum  acceptable  level  of  error  set  by  the  user. 

3.3.1  Background  The  resolution  is  the  number  pixels  (picture  elements)  that 
comprise  the  picture.  More  pixels  means  better  resolution  and  as  technology  ad¬ 
vances  picture  resolution  increases.  Mass  production  also  allows  the  cost  of  the 
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higher  resolution  technology  to  decrease.  These  two  facts  have  allowed  picture  pro¬ 
ducing  devices  to  invade  almost  all  aspects  of  every  day  life. 

But  these  advances  are  not  without  their  price.  Higher  resolution  means  in¬ 
creased  information  storage  and  transfer  requirements.  For  example,  a  picture  with 
a  resolution  of  1024  pixels  by  1024  pixels  with  3  bytes  (24  bits)  per  pixel,  1  byte  for 
each  color  component  (red,  green,  and  blue),  needs  3  megabytes  of  information  saved 
or  transferred  for  each  frame  of  that  picture.  Current  telephone  lines  and  equipment 
do  not  have  sufficient  bandwidth  to  transfer  this  amount  of  information  in  anything 
approaching  real  time  as  would  be  needed  for  teleconferencing.  The  standards  they 
were  based  on  did  not  envision  or  allow  for  video  information  transfer. 

Storing  these  huge  information  files  will  cost  a  large  amount  of  time  and  trans¬ 
ferring  then  from  place  to  place  is  difficult.  The  soluti  :>n  ■>  o  both  problems  is  some 
sort  of  data  compression  or  minimization.  Data  compresL'ion  te^finiques  wreak  great 
violence  on  the  data  by  taking  advantage  of  redundancies  in  tli :  data  and  the  non¬ 
linear  operation  of  the  human  eye.  By  exploiting  correlation  in  space  of  still  images 
and  separate  frames  of  video  data,  and  correlation  in  time  between  frames  for  video 
data,  large  compression  gains  can  be  realized.  For  still  images  compression  ratios  of 
10:1  to  50:1  (2)  can  be  tichieved.  For  video  data  compression  ra  ios  of  50:1  to  200:1 
(2)  can  be  achieved.  But  these  methods  are  lossy  in  that  the  rt  :onstructed  images 
are  quantitatively  and  qualitatively  different  than  the  original,  'i  his  and  other  lossy 
techniques  discard  much  of  the  original  data  and  rely  on  the  hun  an  eye  to  interpret 
the  reconstructed  image  properly.  This  is  acceptable  in  many  ap-  ’ications  but  some, 
such  as  medical  imaging,  must  have  high  quality  reconstructior, . 

For  those  applications  that  require  that  there  be  little  oi  no  qualitative  dif¬ 
ference  between  the  original  and  reconstructed  image  a  diffc  «.ent  technique  must  be 
used.  The  compression  ratio  will  be  much  lower  for  these  data  s-^t-s,  as  low  as  2:1 
for  very  high  quality  image  reconstruction  and  as  high  as  4:1  for  low  quality  image 
reconstruction.  This  research  effort  is  aimed  at  satisfying  the  needs  of  those  applica- 
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tions  that  require  high  quedity  image  reconstruction.  The  following  section  describes 
several  proposed  methods  of  statistically  minimizing  this  type  of  data. 

3.3.2  Szidarovszky’s  Methods  Szidarovszky  describes  a  number  of  methods 
on  how  data  set  minimization  can  be  accomplished  (18:334-336)  (3:193-195).  Six  of 
the  methods  is  presented  in  this  section. 

Method  1.  This  method  minimizes  the  estimation  vp’-iance  subject  to  a  given  number 
of  additional  measurement  locations  or  cost  (18:334).  This  model  assumes  there  are 
k  existing  known  points  and  that  n  —  k  additional  points  are  to  be  added  to  the  data 
set  such  that  their  choice  of  location  minimizes  the  estimation  variance.  If  there  are 
no  existing  known  points  then  k  =0.  All  remaining  points  are  aidded  to  the  data  set 
and  then  these  points  are  removed,  one  at  a  time,  with  the  data  set  being  tested  after 
each  removal  and  the  point  that  yields  the  lowest  estimation  variance  upon  removal 
is  discarded.  This  is  repeated  until  n  points  remain  in  the  data  set.  The  optimal 
selection  method  is  based  on  assuming  the  following  two  monotonic  properties  of  the 
estimation  variance:  1)  After  increasing  n  by  one  by  adding  the  point  x„+j  then: 

Var{n  -f  l,x„+i)  <  Var{n,x) 

and  2)  After  decreasing  n  by  one  by  removing  the  point  the  estimation  variance 
increases  based  on  the  previous  observation.  By  use  of  enumeration  in  a  tree  search 
procedure  the  optimal  data  set  can  be  selected.  Therefore,  this  method  starts  by 
adding  all  the  points  to  the  data  set  and  then  removing  them  one  by  one  until  there 
cire  n  points  remaining. 

Method  2.  This  method  minimizes  the  number  of  additional  measurement  locations 
or  cost  subject  to  upper  bounds  given  to  the  estimation  variances  (18:336).  This 
method  is  substantially  the  same  as  Method  1.  The  three  main  differences  are: 
1)  Start  with  no  additional  points  added;  2)  cidd  points  instead  of  removing  them; 
3)  Perform  the  test  M  <  c  at  each  new  node  where  c  is  the  upper  bound  on  the 
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estimation  variance  and  M  is  the  current  estimation  variance  of  the  data  set. 

Method  3.  This  method  determines  the  locations  of  a  fixed  number  of  measurement 
points  using  total  enumeration  and  ects  the  set  with  the  smallest  estimation  vari¬ 
ance  (3:193).  The  initial  data  set  contjuns  k  existing  data  points  and  n—k  additional 
points  are  to  be  added  to  this  set  from  N  avjulable  points.  Assuming  N  >  n  —  k  the 
total  data  set  to  be  tested  for  minimal  estimation  variance  is  given  by: 

m 

{N-n  +  k)l{n-k)\ 

Using  a  search  tree,  test  for  one  of  the  following  conditions  at  each  node:  1)  the 
number  of  elements  of  the  subset  of  the  data  to  be  added  to  the  data  set  equals 
n  —  k;  or  2)  all  nodes  which  are  endpoints  of  arcs  starting  from  this  node  have  all 
ready  been  searched.  If  either  conditions  holds  then  proceed  backward  from  this 
node,  otherwise  proceed  forward  to  the  next  point  which  has  not  been  searched. 
Moving  forward  is  equivalent  to  adding  a  point  to  the  set  while  moving  backward 
is  equivalent  to  removing  a  point  from  the  set.  The  initial  subset,  or  node,  has  no 
points. 

Method  This  method  determines  the  locations  of  a  fixed  number  of  measurement 
points  using  enumeration  constrained  by  a  branch  and  bound  procedure  and  selects 
the  set  with  the  smallest  estimation  variance  (3:193).  This  method  is  very  similar 
to  Method  3,  but  there  are  two  major  differences.  The  first  difference  is  in  the 
construction  of  the  search  tree.  The  initial  node,  or  subset,  contains  all  the  points 
available  to  add  to  the  data  set  and  moving  along  each  arc,  from  node  to  node, 
is  equivalent  to  removing  one  point  from  the  subset.  The  second  difference  is  an 
additional  condition  that  must  also  be  checked:  the  current  estimation  variance  is 
not  less  than  the  smallest  one  found  for  subsets  containing  exactly  n  —  k  points.  In 
this  case  removing  more  points  makes  the  estimation  variance  larger  than  a  .solution 
previously  determined. 
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Method  5.  This  method  starts  with  an  initial  data  set  and  sequentially  includes 
additional  measurements  at  the  locations  that  yield  the  smallest  estimation  variance 
and  terminates  upon  reaching  the  desired  number  of  points  (3:194).  This  method  is 
similar  to  Method  3  in  that  the  initial  subset  is  empty  but  the  selection  of  the  next 
point  to  add  to  the  subset  is  the  one  that  produces  the  smallest  estimation  variance. 
This  is  continued  until  n  —  k  points  have  been  added  to  the  subset. 


Method  6.  This  method  starts  with  an  initial  data  set  containing  the  desired  num¬ 
ber  of  points  and  sequentially  exchanges  points  from  this  data  set  with  points  not 
included  this  data  set  keeping  the  exchanges  that  yield  the  smallest  estimation  vari¬ 
ance  (3:195).  For  this  method  let  A'o  =  (^i,--*)<n-it)  and  Xi  =  ,<a?)  and 

j  =  1.  Try  to  exchange  the  element  of  Xq  systematically  with  the  elements  of 
Xi.  The  exchange  that  minimizes  the  estimation  variance  is  kept.  If  no  exchange 
can  decrease  the  estimation  variance  then  do  not  make  any  exchanges  and  modify  j 
as  follows: 

f  j  -I- 1  if  j  <n-  k 


J  =  S 


(  1  if  j  =  n  —  k 


and  try  exchanging  the  element  of  Xo  optimally.  The  procedure  terminates  when 
no  exchange  decrecises  ihe  estimation  variance. 


3.3.3  Brodkin’s  Method  The  method  chosen  for  implementation  for  this  effort 
is  a  combination  of  Method  2  and  Method  5  with  slight  modifications.  This  method 
minimizes  the  number  of  additional  points  to  add  to  an  initial  data  set  subject  to 
upper  bounds  given  to  the  estimation  variances.  Additional  points  are  sequentially 
included  at  the  locations  that  yield  the  smallest  estimation  variance.  Therefore,  to 
find  the  minimal  data  set,  the  following  steps  must  be  performed: 

Step  1.  Compute  the  estimate  and  estimation  variances  for  all  points  not 
included  in  the  initial  data  set. 
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Step  2.  Compare  the  largest  estimation  variance  with  the  maximum  allowed 
value  specified  by  the  user.  If  the  calculated  value  is  larger  than  the  required  maxi¬ 
mum  add  a  point  at  that  location  and  go  to  step  1,  otherwise  stop. 

Gridded  Data.  If  the  data  is  gridded,  that  is  regularly  spaced  with 
the  data  grid  fully  populated,  then  the  following  technique  is  developed  and  presented 
as  a  method  of  selecting  the  minimal  data  set.  The  method  takes  maximum  advan¬ 
tage  of  the  gridded  nature  of  the  data  to  minimize  the  computational  time.  Also,  if 
the  correct  initial  pattern  and  pattern  size  is  chosen,  the  minimal  data  set  generated 
will  be  optimal  or  barely  suboptimal. 

As  stated  previously  in  the  literature  review,  the  estimation  variance  at  a  point 
is  a  function  only  of  the  relative  distance  and  orientation  of  the  known  points  used 
to  estimate  that  point.  This  function  is  captured  in  the  theoretical  semivariogram 
model.  Therefore,  if  a  pattern  of  points  could  be  chosen  such  that  the  largest  esti¬ 
mation  variance  is  just  within  the  maximum  allowed  variance  then  that  pattern  of 
points  would  constitute  the  optimal  minimal  data  set.. 

The  method  required  for  minimizing  one  dimensional  gridded  data  will  help 
illustrate  the  mechanics  of  this  minimization  method  and  is  developed  first.  The 
method  will  then  be  extended  to  two  dimensional  gridded  data.  If  all  the  points  are 
aligned  on  an  axis  then  the  largest  estimation  variance  would  be  at  the  midpoint  of 
the  two  points  which,  without  any  intervening  points,  are  farthest  apart.  This  then 
reduces  the  problem  to  ensuring  that  the  distance  between  any  pair  of  consecutive 
points  is  just  less  than  that  which  produces  the  maximum  allowed  variance.  If 
the  points  are  closer  than  this  distance  then  the  data  set  may  not  be  minimal.  If 
the  points  are  farther  apart  than  this  distance  then  the  maximum  allowed  variance 
criteria  won’t  be  met. 

This  analogy  can  be  extended  to  multiple  dimensions  with  little  difficulty.  For 
this  effort,  a  two  dimensional  pattern  is  desired  such  that  the  maximum  allowed 


3-10 


Figure  3.1.  Basic  Kriging  Pattern 

variance  criteria  is  met.  Kriging  does  not  produce  good  results  when  a  point  beyond 
an  edge  of  the  data  field  is  estimated.  Therefore,  the  corner  points  of  the  area 
of  interest  must  be  included  in  the  minimal  data  set.  This  suggests  a  square  or 
rectangular  pattern  but  it  was  felt  that  including  the  center  point,  producing  an  ”X” 
pattern,  would  produce  better  results  by  needing  less  data  set  for  the  same  maximum 
allowed  variance.  Therefore,  the  "X”  pattern  based  on  the  semi-inclusion  distances 
shown  in  Figure  3.1  is  typical.  Where  “x”  is  a  known  point  and  represents  points 
to  be  estimated. 

At  this  point  the  semi-inclusion  distance  must  be  defined.  The  distance  be¬ 
tween  the  center  point  and  a  corner  point  must  be  less  than  the  semivariogram  range, 
a.  This  allows  any  unknown  point  within  the  square  to  be  estimated  based  on  no 
fewer  than  two  known  points,  the  center  point  and  one  or  more  corner  points.  The 
X  semi-inclusion  distance,  lAX,  is  then  defined  as  the  number  of  rows  between  the 
center  point  and  a  corner  point  plus  one.  The  y  semi-inclusion  distance,  lAY,  is  then 
defined  as  the  number  of  columns  between  the  center  point  and  a  corner  point  plus 
one.  The  distance  between  the  center  point  and  a  corner  point  must  not  be  larger 
than  the  semivariogram  range,  but,  it  may  be  substantially  smaller. 

If  the  inclusion  distance,  twice  the  semi-inclusion  distance,  is  too  large,  then 
the  largest  estimation  variance  will  be  larger  than  the  maximum  allowed  variance. 
Therefore,  a  point  will  need  to  be  added  to  the  data  set.  This  first  point  will  be 
added  at  the  midpoint  of  the  longest  edge  and,  due  to  symmetry,  a  point  will  also 
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Figure  3.2.  Replicated  Beisic  Kriging  Pattern 

be  added  to  the  other  longest  edge.  If  the  maximum  varitince  criteria  is  still  not  met 
then  two  more  points,  one  at  the  midpoint  of  each  remaining  edge,  will  be  added. 
If  the  maximum  variance  criteria  is  still  not  met  then  four  points,  due  to  symmetry, 
will  be  added,  one  at  each  of  the  midpoints  of  the  lines  joining  the  center  point  and 
each  corner  point.  This  will  result  in  the  pattern  shown  in  Figure  3.2.  Where 
represents  an  added  point  and  unknown  points  are  omitted  for  clarity.  It  can  be  seen 
that  the  original  pattern  is  now  replicated  four  times  within  the  original  data  field. 
If  the  semi-inclusion  distance  is  divisible  by  two  then  the  replicated  pattern  will  be 
identical  to  the  original  pattern  except  for  scale.  If  the  pattern  is  exactly  replicated 
the  size  of  the  problem  can  be  reduced  by  a  factor  of  four,  that  is  a  factor  of  two  in 
each  direction.  This  will  greatly  reduce  the  computational  time.  It  should  also  be 
noticed  that  the  weights  calculated  for  one  of  the  replicated  patterns  are  identical 
to  the  weights  calculated  for  the  other  replicated  patterns.  These  two  facts  are  used 
to  great  advantage  in  the  minimization  programming. 

In  order  to  ensure  that  the  estimation  variance  at  a  point  being  estimated  is 
minimized,  surrounding  points  must  be  included  in  the  kriging  pattern  if  possible. 
Therefore,  the  points  that  are  twice  the  semi-inclusion  distances  in  each  direction 
from  the  center  of  the  pattern  are  included  as  part  of  the  kriging  pattern  for  areas 
located  in  the  central  region  of  the  data  field.  This  results  in  the  pattern  shown  in 
Figure  3.3.  Where  “x”  is  a  known  point  within  the  area  being  estimated  and  “s”  is 
a  surrounding  point  included  to  minimize  the  estimation  variance.  Only  the  points 
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Figure  3.3.  Augmented  Kriging  Pattern 
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Figure  3.4.  Kriging  Areas 


bounded  by  lines  connecting  the  corner  “x”  points  are  being  estimated.  The  edges 
and  corners  must  be  estimated  separately  from  the  central  area  because  there  are 
no  points  beyond  the  edges  to  include  in  the  pattern  and  the  point  pattern  on  the 
edges  and  corners  will  not  be  identical  to  the  central  area  point  patterns.  If  the  point 
patterns  are  not  identical  then  neither  are  the  weights.  This  creates  nine  separate 
areas  to  krige  as  shown  in  Figure  3.4.  As  the  figure  illustrates,  the  nine  areas  are  the 
four  corners,  the  four  edges,  and  the  central  section.  For  clarity  it  is  not  shown,  but 
the  edges  of  each  kriging  area  overlaps  the  edges  of  two  or  more  other  kriging  areas. 
The  weights  only  need  to  be  calculated  once  for  one  kriging  area  for  the  edge  and 
central  sections  and  can  be  used  for  each  identically  patterned  area  in  that  section. 
This  will  mean  a  great  savings  in  computational  time. 

By  now  it  should  be  apparent  that  obtaining  the  optimal  minimal  data  set  is 
heavily  dependent  on  the  initial  values  of  the  semi-inclusion  distances.  If  these  values 
are  not  an  integer  multiple  of  the  ideal  values  then  too  many  points  will  be  added 
to  meet  the  maximum  allowed  variance  criteria.  Therefore,  an  iterative  routine  was 
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written  to  determine  these  ideal  values.  All  the  points  within  a  kriging  area  need 
not  be  estimated  to  determine  the  largest  estimation  variance.  Only  those  points 
within  a  small  area  located  at  the  maximum  distances  from  the  known  points  need 
to  be  estimated  to  determine  the  maximum  estimation  variance  within  that  kriging 
area.  This  method  for  determining  the  ideal  semi-inclusion  distances  also  produces 
a  large  savings  in  computational  because  fewer  points  are  added  and  hence  fewer 
points  need  to  be  estimated  again  based  on  the  added  points.  These  routines  are 
documented  in  Appendix  E,  The  Programmers  Manual. 

It  has  not  been  explicitly  stated  but  the  matrix  of  semivariogram  values  in¬ 
cludes  all  the  points  in  the  kriging  pattern,  even  those  that  are  outside  the  theoretical 
semivariogram  range  of  the  point  being  estimated.  If  this  were  not  so  then  a  new 
[A]  matrix  would  be  required  for  every  point  to  be  estimated  and  the  great  compu¬ 
tational  time  savings  would  be  lost.  Also  not  explicitly  stated  is  the  requirement  to 
add  points  to  the  entire  grid  such  that  the  current  kriging  area  pattern  is  replicated 
throughout  the  grid. 

Of  interest  (for  the  pattern  minimization  method)  is  the  implementation  of  a 
matrix  inversion  method  that  takes  advantage  of  the  prior  matrix  inversion.  This  is 
of  interest  due  to  the  fact  that  adding  points  to  the  pattern  adds  rows  and  columns 
to  the  [A]  matrix  and  points  are  added  based  on  estimation  variances  calculated  by 
inverting  the  prior  [A]  matrix.  This  routine  uses  a  matrix  partitioning  method  to 
reinvert  the  matrix  (19:192).  Given  the  two  matrices,  [A]  and  [.B],  partitioned  into 
submatrices  as  follows: 


A 


P  Q 
R  S 


Y  Z 
U  V 
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where  [yl]  •  [B]  =  [/]  and  P~^  is  known.  Performing  the  multiplication  it  can  be  seen 
that: 


P‘Y  +  Q‘U 

=  I 

(3.1) 

P‘Z  +  Q^V 

=  0 

(3.2) 

R-Y  +  S‘U 

=  0 

(3.3) 

RZ-¥S‘V 

=  I 

(3.4) 

solving  the  second  equation  for  Z: 

Z  =  -P-^.Q.V 

combining  this  with  the  fourth  equation  and  solving  for  V : 

V  =  {S-R-P-^  ^Q)-^ 
solving  the  first  equation  for  Y : 

Y  =  p-^-{I-Q-U)  =  -  (P-^  •  Q)  •  [/ 

combining  this  with  the  third  equation  yields: 

U  =  -(5-P<p-‘-Q)-^-P-P-'  =  -VR-P-^ 
therefore,  the  following  relationships  are  established: 

V  =  (5-P.p-* -Q)-' 

Z  =  -P~^  Q’V 

U  =  -V-RP-^ 

Y  =  P-' -  (P-*  •  Q)  •  {/ 
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If  the  number  of  added  rows  and  columns  is  small  relative  to  the  original  matrix  size, 
the  computational  time  saved  by  not  inverting  the  entire  matrix  is  substantial. 

S.4  Procedure  Application 

The  data  used  for  proof  of  concept  are  Magnetic  Resonance  Image  (MRI)  brain 
scans.  The  scans  are  comprised  of  several  parallel  planes  of  data  in  which  each  plane, 
or  slice,  is  268  measurements,  (pixels)  wide  and  267  measurements,  (pixels)  long. 
Each  measurement  is  an  intensity  represented  as  a  grey  scale  value  between  0  and 
255.  The  values  can  be  represented  by  1  byte  unsigned  integers.  Therefore,  one  brain 
scan,  containing  60  slices,  will  need  4  megabytes  of  disk  space.  This  type  of  data  is 
used  for  proof  of  concept  due  to  the  requirement  that  the  quality  of  the  reconstructed 
image  be  as  good  as  the  original  image.  Since  kriging  is  not  a  lossy  technique,  this 
type  of  data  is  ideally  suited  for  kriging  minimization.  For  this  application  the  pixel 
location  in  each  slice,  row  and  column,  are  used  as  the  coordinates  of  the  point  and 
the  grey  scale  value  is  used  as  the  quantity  of  interest  at  each  point. 

Using  the  procedures  developed  in  the  previous  sections,  minimal  data  sets 
for  various  maximum  variances  and  initial  data  patterns  are  found.  The  following 
summarizes  the  steps  taken  to  produce  the  minimal  aata  sets. 

Step  1.  Data  Partitioning.  Only  slice  24  is  partitioned  for  proof  of  concept. 
The  partitioning  of  slice  24  is  illustrated  in  Figure  3.5.  Not  all  the  data  sets  would 
benefit  from  partitioning.  The  partitions  that  the  data  set  may  be  divided  into  may 
be  too  small  to  appreciably  decrease  the  number  of  points  in  the  minimal  data  set 
and  is,  therefore,  not  worth  the  extra  effort  involved  in  partitioning  and  reeissembling 
the  data  set.  Slice  1  is  a  good  example  of  a  data  set  that  would  not  benefit  from 
partitioning.  The  central  image  would  be  in  one  partition  and  the  background  would 
be  in  four  very  thin  edge  partitions  and  four  very  small  corner  partitions.  This 
partitioning  may  actually  increase  the  number  of  points  in  the  minimal  data  set 
since  the  corner  points  of  each  partition  must  be  included  in  the  minimal  data  set 
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Figure  3.5.  Partitioning  of  Slice  24 


so  that  points  beyond  the  edge  of  the  data  set  aren’t  estimated. 

Step  8.  Trend  Removal.  Globed  trend  is  not  removed  from  any  of  the  data  sets 

(17). 

Step  3.  Semivariogram  Determination.  The  data  showed  both  geometric  and 
zonal  anisotropy  as  illustrated  in  Figure  3.6  and  Figure  3.7.  The  kriging  program 
can  accommodate  the  different  ranges  due  to  geometric  anisotropy  and  averages  the 
sills  to  remove  the  zonal  anisotropy  from  the  theoretical  semivariogr^un  model  (not 
from  the  data).  The  theoretical  spherical  semivariogram  model  is  plotted  with  the 
experimental  semivariogram  in  Appendix  B  for  the  0°  and  90°  directions  of  every 
slice  of  MRI  data  used. 
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Figure  3,7.  Slice  12  Semivariogram  in  the  90°  Direction 
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Step  4‘  Kriging  of  Data  Sets.  Using  the  kriging  program,  minimal  data  sets  for 
three  maximum  allowed  variance  leveL  and  five  initial  data  patterns  were  produced. 
The  three  maximum  allowed  variances  were  chosen  such  that  the  szune  values  could 
be  used  for  all  slices  without  causing  all  the  known  points  to  be  included  in  the 
minim2d  data  set.  A  second  requirement  was  that  the  values  chosen  would  produce 
recognizable  images  of  slice  one  with  a  minimum  difference  between  the  values  of 
fifteen.  This  was  accomplished  by  iteratively  kriging  all  the  slices  using  integer  values 
for  the  maximum  allowed  variances  until  the  desired  results  were  obtained.  The  five 
semi-inclusion  distances  were  chosen  by  requiring  a  square  pattern  and  counting  up 
from  one.  Therefore,  the  three  maximum  allowed  variances  were  100.0,  85.0,  and 
70.0.  The  five  initial  data  patterns  were  for  the  semi-inclusion  distances  of  1,  2,  3, 
4,  and  5.  The  partitioned  slice  was  minimized  with  meiximum  allowed  variances  of 
11.0,  12.0,  and  13.0.  These  values  were  chosen  such  that  the  image  quality  produced 
was  comparable  to  that  obtained  using  semi-inclusion  distances  of  1,  2,  and  3.  This 
was  also  accomplished  by  iteratively  kriging  the  data  set.  A  visual  inspection  was 
performed  to  determine  performance  of  the  procedure.  A  display  program  was  used 
to  display  the  slices  for  visual  inspection.  Photographs  of  the  original  and  estimated 
slices  are  in  Appendix  C.  The  reduction  percentage  and  largest  variances  of  the  first 
slice  are  shown  in  Table  3.1.  Tables  showing  the  percent  reduction  of  data  points,  the 
largest  variance,  and  the  photograph  number  of  the  results  of  all  the  kriged  minimal 
data  sets  are  in  Appendix  D. 

Step  5.  Trend  Addition.  Because  trend  is  not  removed  this  step  is  not  per¬ 
formed  for  these  data  sets. 

Step  6.  Partition  Assembly.  The  one  partitioned  slice  is  reassembled  before 
the  visual  inspection. 

The  steps  summarized  above  are  developed  in  the  previous  sections  and  demon¬ 
strate  the  process  used  in  applying  kriging  in  the  production  of  minimal  data  sets. 
These  steps  provide  the  methodology  for  obtaining  the  minimal  data  sets. 
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Table  3.1.  Slice  1  Statistics 


Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Maximum 

o 

o 

,-1 

98.98 

98.70 

Allowed 

85 

84.39 

96.53 

Variance 

70 

68.37 

74.90 

Initial 

5,5 

91.56 

97.88 

Semi- 

4,4 

85.72 

96.77 

Inclusion 

iSI 

79.72 

94.34 

Distances 

2,  2 

73.50 

87.36 

lAX,  lAY 

1,1 

66.92 

50.00 

As  this  chapter  has  shown,  the  procedures  developed  can  produce  a  minimal 
data  set  in  a  statistically  controlled  manner.  The  controlling  parameter  is  the  max¬ 
imum  allowed  variance  set  by  the  user.  To  aid  the  kriging  program  in  obtaining  the 
best  results  a  trend  removal  program  and  a  partitioning  program  were  developed. 
To  capture  the  structure  of  the  data  set  in  continuous  function,  a  theoretical  semi- 
variogram  model  producing  program  was  developed.  To  replace  the  removed  global 
trend  a  rebuilding  program  was  developed. 
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IV.  Results  and  Conclusions 


This  chapter  includes  the  results  of  the  research  and  several  conclusions  based 
on  these  results.  As  previously  stated,  the  purpose  of  this  research  effort  was  to 
develop  a  procedure  that  could  select  a  minimal  data  set  in  a  statistically  controlled 
fashion  and  estimate  unknown  values,  at  regular  grid  locations,  based  on  a  user 
desired  overall  maximum  level  of  error.  This  goal  was  achieved.  The  following  results 
and  conclusions  are  provided  with  reference  to  the  objective  outlined  in  Chapter  I. 

4.1  Rtsidts 

In  general,  the  results  of  this  effort  are  the  procedures  developed  to  produce 
minimal  data  sets.  The  procedures  developed  in  Chapter  III  and  the  computer  pro¬ 
grams  written  for  this  effort  provide  the  means  for  producing  minimal  data  sets.  Nine 
slices  of  Magnetic  Resonance  Image  data  of  a  baby’s  brain  were  used  to  demonstrate 
use  of  the  procedures.  The  slices  used  are  numbered  1,  3,  6,  9,  12,  15,  18,  21,  and 
24.  These  provided  a  representative  cross  section  of  the  entire  data  set. 

As  mentioned  previously,  no  global  trend  was  removed.  Five  minimal  data 
sets  were  produced  for  each  slice  using  two  different  methods.  The  first  method 
produced  three  minimal  data  sets  based  on  maximum  allowed  variances  of  100.0, 
85.0,  and  70.0  gray  scale  values  squared.  The  second  method  produced  five  minimal 
data  sets  based  on  semi-inclusion  distances  for  (IAX,IAY)  of  (5,5),  (4,4),  (3,3),  (2,2), 
and  (1,1).  The  three  maximum  allowed  variances  were  chosen  such  that  the  same 
values  could  be  used  for  all  slices  without  causing  all  the  known  points  to  be  included 
in  the  minimal  data  set.  A  second  requirement  was  that  the  values  chosen  would 
produce  recognizable  images  of  slice  one  with  a  minimum  difference  between  the 
values  of  fifteen  gray  scale  values  squared.  This  was  accomplished  by  iteratively 
kriging  all  the  slices  using  integer  values  for  the  maximum  allowed  variances  until 
the  desired  results  were  obtained.  The  five  semi-inclusion  distances  were  chosen  by 
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requiring  a  square  pattern  and  counting  up  from  one.  For  the  data  sets  produced 
using  the  initial  point  pattern  based  on  the  semi-inclusion  distances  the  maximum 
allowed  variance  wa^  set  to  a  value  large  enough  to  ensure  that  no  points  would  be 
added  to  the  data  set.  Slice  24  was  also  partitioned  into  a  background  and  a  center 
data  partition  and  three  minimal  data  sets  were  produced  using  maximum  allowed 
variances  of  lo.O,  12.0,  and  11.0  gray  scale  values  squared.  These  values  were  chosen 
such  that  the  image  quality  produced  was  comparable  to  that  obtained  using  semi¬ 
inclusion  distances  of  1,  2,  and  3.  This  was  also  accomplished  by  iteratively  kriging 
the  data  set. 

For  this  effort,  the  background  of  each  image,  which  is  normally  filtered  out 
by  the  display  program,  was  considered  to  be  known  data  along  with  the  brain  scan 
data. 

Figure  4.1  and  Figure  4.1  are  plots  of  the  experimental  and  theoretical  semi- 
variograms  for  slice  one  for  the  0°  and  90®  directions  and  are  typical  for  all  the  slices. 
The  data  has  both  geometric  and  zonal  anisotropy.  The  plots  of  the  experimental 
and  theoretical  spherical  semivariograms  for  the  0°  and  90®  directions  for  all  slices, 
including  the  partitioned  slice,  are  in  Appendix  B. 
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iogram  in  the  0°  Direction 


As  evidenced  by  Figure  4.3  and  Figure  4.4,  minimizing  each  slice  based  on 
the  same  maximum  allowed  variance  did  not  produce  the  same  quality  of  image 
for  all  slices.  For  the  same  maximum  allowed  vauriance  of  70.0,  slice  9  looks  like 
a  brain  scan  whereas  slice  24  looks  like  an  indistinct  blob.  However,  Figure  4.5 
and  Figure  4.6  show  that  minimizing  based  on  initial  semi-inclusion  distances  did 
produce  comparable  quality  images  for  all  slices.  This  is  due  to  the  fact  that  the 
estimation  variance  is  directly  proportional  to  the  sill,  C  -f  CO,  minus  the  nugget 
effect,  CO,  of  the  theoretical  semivariogram  model.  Therefore,  the  slices  with  smaller 
sill  minus  nugget  effect  values  will  have  fewer  points  in  the  minimal  data  set  for  the 
same  maximum  allowed  variance.  With  fewer  known  points  from  which  to  estimate 
the  surface,  the  estimated  surface  is  noticeably  different  than  the  original  image. 
By  using  the  semi-inclusion  distances  to  determine  the  minimal  data  set,  the  same 
number  of  points  was  used  to  estimate  the  surface  for  each  slice.  Thus  the  difference 
between  the  original  image  and  the  estimated  surface  was  much  less  noticeable  for 
these  minimal  data  sets.  Therefore,  by  using  semi-inclusion  distances  of  (1,1),  half 
the  data  may  be  removed  with  barely  noticeable  differences  between  the  original 
and  estimated  images.  By  using  partitioning  on  slice  24,  as  shown  in  Figure  4.7,  a 
further  reduction  of  16.14  percent  in  the  number  of  points  was  achieved  with  out  a 
reduction  in  the  quality  of  the  image.  The  partitioned  image  is  shown  in  Figure  4.8. 
Photographs  of  the  results  of  the  kriging  procedures  for  all  slices  are  in  Appendix  C. 
Each  series  of  images  has  the  original  data  set  on  the  left,  the  kriged  image  in  the 
center,  and  the  error  between  the  original  and  kriged  image  on  the  right. 
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Figure  4.3.  Slice  9,  Maximum  Variance  =  70 
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Figure  4.4.  Slice  24,  Maximum  Variance  =  70 
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Figure  4.7.  Slice  24  Partitioning 
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Table  4.1  and  Table  4.2  show  the  statistics  for  slice  9  and  slice  24.  The  magni¬ 
tudes  of  the  numbers  in  the  tables  are  typical  for  all  slices.  The  typical  run  times,  for 
the  minimal  data  sets  produced  using  the  kriging  pattern,  were  under  ten  minutes. 
The  run  times  for  the  minimal  data  sets  that  were  not  able  to  take  advantage  of  the 
kriging  pattern  were  as  high  as  two  hours.  The  statistics  for  all  slices  are  tabulated 
in  Appendix  D. 
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Table  4.1.  Slice  9  Statistics 


Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Run  Time 
hr:min:s 

Photograph 

Maximum 

Allowed 

Variance 

100 

98.69 

98.33 

00:45:04 

10(a) 

85 

82.09 

93.63 

00:39:55 

10(b) 

70 

69.18 

25.09 

01:34:58 

10(c) 

Initial 

Semi- 
Inclusion 
Distances 
lAX,  lAY 

5,  5 

95.44 

97.88 

00:03:33 

11(a) 

4,4 

89.79 

96.77 

00:03:46 

11(b) 

3,  3 

83.99 

94.34 

00:04:42 

11(c), 12(a) 

2,  2 

77.97 

87.36 

00:05:51 

12(b) 

1,1 

71.61 

50.00 

00:19:48 

12(c) 

Table  4.2.  Slice  24  Statistics 


Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Run  Time 
hr:min:s 

Photograph 

Mtiximum 

Allowed 

Variance 

100 

53.15 

99.96 

01:42:16 

25(a) 

85 

53.15 

99.96 

02:02:15 

25(b) 

70 

■iSireM 

99.96 

01:41:20 

25(c) 

Initial 

Semi- 
Inclusion 
Distances 
lAX,  lAY 

5,  5 

3.94 

97.88 

00:04:06 

26(a) 

4,4 

3.15 

96.77 

00:04:25 

26(b) 

3,  3 

2.36 

94.34 

00:05:18 

26(c), 27(a) 

2,2 

1.57 

87.36 

■iinnifaM 

27(b) 

1,1 

0.79 

IIHS3I 

00:20:29 

27(c) 
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Not  shown  in  the  appendices  is  the  high  degree  of  automation  achieved.  All 
the  figures,  tables,  and  photographs  can  be  recreated  in  under  two  days  using  one 
Silicon  Graphics  220  work  station.  This  was  achieved  by  generating  all  the  required 
control  files  and  running  the  programs  in  batch  mode  using  a  shell  script. 

The  initial  minimal  data  set  points  produced  indicated  that  a  square  or  rectan¬ 
gular  pattern  is  better  than  an  “X”  pattern  for  producing  the  optimal  minimal  data 
set.  This  fact  was  used  to  develop  the  CalculateJAXJAY  routine  to  determine  the 
ideal  initial  semi-inclusion  distances.  The  data  sets  were  then  kriged  again  to  pro¬ 
duce  the  optimal  minimal  data  sets  presented  in  this  effort.  Inspection  of  the  tables 
in  Appendix  D  reveals  that,  except  for  slice  24,  the  largest  estimation  variance  of  all 
the  minimal  data  sets  produced  was  within  3.2  percent  of  the  maximum  allowed  vari¬ 
ance.  Due  to  the  very  low  sill  minus  nugget  effect  of  the  theoretical  semivariogram 
model  for  slice  24,  which  is  directly  related  to  the  calculated  estimation  variance, 
the  largest  distance  allowed  for  the  semi-inclusion  distances  was  not  large  enough  to 
bring  the  calculated  estimation  variance  up  to  the  maximum  allowed  variance.  The 
limit  on  the  semi-inclusion  distances  allows  all  points  that  are  to  be  estimated  to  be 
within  range  of  two  or  more  knowm  points. 

4.2  Conclusions 

In  conclusion,  this  thesis  develops  and  demonstrates  the  application  of  kriging 
in  the  controlled  minimization  of  large  data  sets.  Specifically,  the  following  were 
shown:  1)  A  minimal  data  set  can  be  selected  based  on  a  maximum  acceptable  level 
of  error;  2)  Minimal  data  sets  based  on  the  semi-inclusion  distances  produced  more 
uniform  quality  in  the  reconstructed  images  for  each  slice;  3)  For  gridded  data  a 
rectangle  pattern  of  known  points  produces  a  minimal  data  set  that  is  closer  to  the 
optimal  minimal  data  set  than  an  “X”  pattern  of  known  points;  4)  For  large  data  sets 
(more  than  71,000  points)  containing  gridded  data  the  run  times  are  very  reasonable 
(typically  under  ten  minutes);  5)  Performing  trend  removal  and  partitioning  on  the 
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data  can  substantially  improve  the  results;  6)  Minimal  data  set  selection  can  be  fully 
automated;  and  7)  This  procedure  can  be  applied  to  any  type  of  data. 

In  achieving  the  goal  of  this  thesis,  two  objectives  were  accomplished.  First,  a 
viable  kriging  procedure  was  developed.  This  kriging  procedure  included  the  struc¬ 
tural  analysis  of  the  data  and  the  development  of  a  universal  kriging  program  for 
estimating  the  surfaces  and  the  variances.  Secondly,  the  procedures  were  demon¬ 
strated  using  Magnetic  Resonance  Image  data  and  for  this  data  a  2:1  compression 
ratio  produced  barely  noticeable  differences  in  the  reconstructed  image  over  the  orig¬ 
inal  image.  Further  research  in  this  ar®a  needed,  especially  in  the  application  of  these 
procedures  to  three  dimensional  data. 
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V.  Recommendations 


This  chapter  provides  recommendations  which  suggest  either  improvements  in 
this  effort  or  areas  for  further  research  related  to  this  study.  As  this  thesis  effcit  may 
very  well  be  the  first  automated  application  of  kriging  in  the  controlled  minimization 
of  large  data  sets,  further  research  in  this  area  may  prove  promising.  Recommenda¬ 
tions  are  provided  for  all  areas  of  this  study  and  are  presented  for  consideration. 

5.1  Structural  Analysis 

A  study  concerning  the  sensitivity  of  the  kriging  procedures  to  the  theoretical 
nugget  effect  should  determine  the  robustness  of  the  variogram  structure. 

A  more  sophisticated  partitioning  method  would  produce  better  results  in  im¬ 
ages  in  which  the  anomalous  feature  is  not  aligned  with  the  axis  of  the  image. 

5.2  Kriging 

Extending  kriging  to  three  dimensions  should  be  fairly  straight  forward  and  of 
great  interest  and  benefit  to  many  people.  By  using  known  points  in  adjacent  slices 
then  estimation  variance,  and  thus  the  number  of  points  in  the  minimal  data  set, 
could  be  further  minimized.  This  technique  could  also  be  applied  to  other  types  of 
three  dimensional  data. 

Since  kriging  involves  using  kno\/n  points  and  not  previously  kriged  points  the 
procedure  can  be  highly  vectorized.  Therefore,  running  the  program  on  a  vectorizing 
machine  may  provide  real  time  kriging. 

Allowing  area  specific  semivariograms  may  produce  tremendous  improvements 
in  image  quality  without  resorting  to  partitioning. 
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5.3  Minimization 


Coding  should  be  added  to  the  kriging  procedure  that  would  add  adjacent 
points  to  the  minimal  data  set  if  the  change  in  the  quantity  of  interest  is  above  a 
threshold  value  set  by  the  user.  This  would  improve  the  image  quality  while  reducing 
the  computational  time  otherwise  required  to  add  all  those  required  points. 

Coding  should  be  added  to  properly  minimize  irregularly  distributed  data. 
Some  method  of  finding  the  point  nearest  to  the  desired  location  of  point  addition 
needs  to  be  added  to  the  Add-Pts  routine  for  non-gridded  data. 

The  effects  of  calculating  the  true  variance  of  the  estimate  instead  of  the  theo¬ 
retical  variance  may  provide  better  results  in  image  quality  for  the  same  maximum 
variance  (at  the  cost  of  computational  time  since  the  pattern  may  be  lost).  This 
may  produce  minimal  data  sets  with  the  same  image  quality  give  the  same  maxi¬ 
mum  allowed  variance. 
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Appendix  A.  Users  Manual 


This  appendix  is  in  three  sections.  The  first  section  provides  the  program 
control  variable  names,  usage,  and  programs  actions  based  on  those  inputs.  The 
second  section  contains  two  sample  Krige  control  files.  The  third  section  contains 
a  sample  Varfit  control  file.  To  run  the  programs  type: 

Program  ame  Control  J'ile 

where  ProgramJSlame  is  Krige^  Varfit,  Rebuild,  Partition,  or  Residuals  and 
Control  J'ile  is  the  name  of  the  file  that  contains  the  program  control  variable  names 
and  their  associated  value.  The  interface  routines  may  be  modified  to  procur  the 
program  control  variables  from  anywhere  deemed  more  convenient  by  the  user  (8). 
The  programs  currently  reside  on  Poincare,  a  Silicon  Graphics  220  work  station, 
under  the  directory  /d2lcbrodkinfkrige-Code. 


A.l  The  Inputs 

The  input  names  cire  case  insensitive.  If  an  input  is  specific  to  one  program  then 
the  program  to  which  that  input  is  specific  is  given.  The  “=”  is  part  of  the  variable 
input  name  and  there  can  not  be  any  embedded  blanks  in  the  input  name.  The 
is  a  separator  between  the  variable  input  name  and  its  description.  Text  in  the 
control  file  not  recognized  as  a  program  control  variable  name  is  ignored.  Therefore, 
as  shown  in  the  sample  Krige  control  files,  comments  describing  the  input  may  be 
in  the  control  file. 

End.OfJnpuLValues  -  Signals  end  of  input.  Can  be  used  in  data  files  contain¬ 
ing  input  in  a  header  section.  The  version  of  the  interface  used  for  this  thesis  effort 
does  not  allow  a  data  file  header  to  contain  program  input. 

Data.Filename=  -  Specifies  the  name  of  the  file  containing  the  input  data.  Can 
not  be  longer  than  seventy  characters.  If  not  provided  the  programs  will  print  an 
error  message  and  terminate. 

Output-Filename^  -  Specifies  the  name  of  the  file  in  which  to  write  the  output 
data.  Can  not  be  longer  than  seventy  characters.  If  not  provided  then  no  output  to 
this  file  will  be  written. 

Variance-Filename=  -  Specifies  the  name  of  the  file  in  which  to  write  the 
est’mation  variance  data.  Only  applies  to  the  Krige  program.  Can  not  be  longer 
than  seventy  characters.  If  not  provided  then  no  output  to  this  file  will  be  written. 

MinimaLDataset-Filename=  -  Specifies  the  name  of  the  file  in  which  to  write 
the  minimal  data  set.  Only  applies  to  the  Krige  program  doing  minimization.  Can 
not  be  longer  than  seventy  characters.  If  not  provided  then  no  output  to  this  file 
will  be  written. 

PloLFilename=  -  Specifies  the  base  name  of  the  files  in  which  to  write  the 
experimental  semivariogram  data  in  two  column  format  suitable  for  plotting.  A  “O’* 
is  postpended  for  the  file  containing  the  0®  direction  experimental  semivariogram 
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data.  A  “90”  is  postpended  for  the  file  containing  the  90“  direction  experimental 
semivariogram  data.  Only  applies  to  the  Varfit  program.  Can  not  be  longer  than 
seventy  characters.  If  not  provided  then  no  output  to  these  files  will  be  written. 

VariogramJF'ilename=  -  Specifies  the  name  of  the  file  in  which  to  write  the 
experimental  semivariogram  data  and  the  parameters  of  the  spherical,  linear,  and 
dewijsian  theoretical  semivariogram  models.  Only  applies  to  the  Varfit  program. 
Can  not  be  longer  than  seventy  characters.  If  not  provided  then  no  output  to  this 
file  will  be  written. 

Error. Filename=  -  Specifies  the  name  of  the  file  in  which  to  write  any  program 
error  messages.  Can  not  be  longer  than  seventy  characters.  If  not  provided  then 
output  will  be  directed  to  the  screen. 

A(0)=  -  Specifies  the  theoretical  semivariogram  range  in  the  0°  direction. 

A  (90)=  -  Specifies  the  theoretical  semivariogram  range  in  the  90“  direction. 

C(0)=  -  Specifies  the  spherical  model  theoretical  semivariogram  sill  minus 
nugget  effect  in  the  0°  direction. 

C(90)=  -  Specifies  the  spherical  model  theoretical  semivariogram  sill  minus 
nugget  effect  in  the  90“  direction. 

C0(0)=  -  Specifies  the  spherical  model  theoretical  semivariogram  nugget  effect 
in  the  0“  direction. 

C0(90)=-  Specifies  the  spherical  model  theoretical  semivariogram  nugget  effect 
in  the  90“  direction. 

SphericaLCnrrdation(0)=  -  Specifies  the  spherical  model  theoretical  semivari¬ 
ogram  simple  correlation  in  the  0“  direction. 

SphericaLCorrelaHon(90)=  -  Specifies  the  spherical  model  theoretical  semivar¬ 
iogram  simple  correlation  in  the  90“  direction. 
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LinearJB0(0)=  -  Specifies  the  linear  model  theoretical  semivariogram  BO  coef¬ 
ficient  in  the  0°  direction. 

Linear^0(90)—  -  Specifies  the  linear  model  theoretical  semivariogram  BO  co¬ 
efficient  in  the  90°  direction. 

Linear^  1(0)=  -  Specifies  the  linear  model  theoretical  semivariogram  Bl  coef¬ 
ficient  in  the  0°  direction. 

Linear (90)=  -  Specifies  the  linear  model  theoretical  semivariogram  Bl  co¬ 
efficient  in  the  90°  direction. 

Linear^Correlation(0)=  -  Specifies  the  linear  model  theoretical  semivariogram 
simple  correlation  in  the  0°  direction. 

Linear. CorTelation(90)=  -  Specifies  the  linear  model  theoretical  semivariogram 
simple  correlation  in  the  90°  direction. 

Dewijsian.B0(0)=  •  Specifies  the  dewijsian  model  theoretical  semivariogram 
BO  coefficient  in  the  0°  direction. 

Dewijsian.B0(90)=  -  Specifies  the  dewijsian  model  theoretical  semivariogram 
BO  coefficient  in  the  90°  direction. 

Dewijsian.Bl(0)=  -  Specifies  the  dewijsian  model  theoretical  semivariogram 
Bl  coefficient  in  the  0°  direction. 

Dewijsian.Bl(90)=  -  Specifies  the  dewijsian  model  theoretical  semivariogram 
Bl  coefficient  in  the  90°  direction. 

Dewijsian.Correlation(0)=  -  Specifies  the  dewijsian  model  theoretical  semivar¬ 
iogram  simple  correlation  in  the  0°  direction. 

Dewijsian.Correlation(90)=  -  Specifies  the  dewijsian  model  theoretical  semi¬ 
variogram  simple  correlation  in  the  90°  direction. 

Xstcp=  -  Specifies  the  step  size  in  the  x  direction  for  semivariogram  calculation 
of  gridded  data.  Only  applies  to  the  Varfit  program. 
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Ystep=  -  Specifies  the  step  size  in  the  y  direction  for  semivariogram  calculation 
of  gridded  data.  Only  applies  to  the  Varfit  program. 

Xmax=  -  Specifies  the  maximum  x  coordinate  value  of  the  data  set. 

Xmin=  -  Specifies  the  minimum  x  coordinate  value  of  the  data  set. 

Ymax=  -  Specifies  the  maximum  y  coordinate  value  of  the  data  set. 

Ymin=  -  Specifies  the  minimum  y  coordinate  value  of  the  data  set. 

Tolerance=  -  Specifies  the  minimum  distance  between  points  to  consider  then 
as  distinct.  If  two  points  are  closer  than  this  value  only  the  first  point  is  kept  and 
the  second  point  is  discarded.  This  is  done  so  that  a  singular  matrix,  in  the  kriging 
system  of  equations,  does  not  result.  Only  applies  to  the  Krige  program. 

Maximum-Variance=  -  Specifies  the  maximum  error  variance  allowed  in  the 
selection  of  a  minimal  data  set.  Used  only  if  minimizing.  Based  on  this  value  the 
program  calculates  the  following  value  based  on  the  following  equation: 


Largest  J)if  ference  =  ConfidenceJnterval  •  V Maximum.V  ariance 
Only  applies  to  the  Krige  program. 

Large$LDifference=  -  Specifies  the  desired  maximum  difference  between  the 
estimated  value  and  the  actual  value  at  each  point.  To  be  used  in  the  selection 
of  a  minimal  data  set.  Used  only  if  minimizing.  Beised  on  this  value  the  program 
calculates  the  following  value  based  on  the  following  equation: 


Maximum. Variance 


(  Largest  J)i f ference  \  ^ 
Con  fidence.1  nterval  j 


Only  applies  to  the  Krige  program. 

Confidence.Interval=  -  Specifies  the  desired  statistical  confidence  interval.  To 
be  used  in  the  selection  of  a  minimal  data  set.  Used  only  if  minimizing.  Based  on  this 
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value  the  program  calculates  the  following  values  based  on  the  following  equations 
in  the  order  shown: 

Maximum^ ariance  =  Old-Variance  • 

Largest  JDif  f  erence  =  Con  fidenceJnterval  •  \/ Maximum. Variance 

Where  01d_Value  is  the  prior  value  of  Confidence-Interval  and  Old-Variance  is  the 
prior  vaJue  of  M^lximum-Variance.  Only  applies  to  the  Krige  program. 

Polynomial=  -  Specifies  the  five  coefficients  of  the  polynomial  used  for  trend 
removal.  The  coefficients  are  speciffied  for  the  following  polynomial  in  the  order 
shown: 

A  +  Bx  +  Cy  +  Dxy  +  Ex^  +  Fy^  =  z 
Only  applies  to  the  Rebuild  program. 

Var.Psi=  -  Specifies  the  semi-inclusion  angle,  regularization  factor,  for  non- 
gridded  data  in  the  semivariogram  calculations.  If  the  angle  of  one  point  relative  to 
a  second  point  is  in  the  desired  direction  for  semivariogram  calculation  and  the  points 
are  a  multiple  of  Var_Step  apart  then  that  pair  of  points  is  used  in  the  calculation. 
The  angle  is  allowed  to  vary  from  the  desired  direction  by  Var_Psi.  Only  applies  to 
the  Varfit  program. 

Var-Step=  -  Specifies  the  step  size  for  non-gridded  data  in  the  semivariogram 
calculations.  If  the  angle  of  one  point  relative  to  a  second  point  is  in  the  desired 
direction  for  semivariogram  calculation  and  the  points  are  a  multiple  of  Var_Step 
apart  then  that  pair  of  points  is  used  in  the  calculation.  Only  applies  to  the  Varfit 
program. 

Number-Of.X-Incrtmenls=  -  Specifies  the  number  of  grid  points  in  the  x  di¬ 
rection  at  which  to  estimate  surface  values  and  estimation  variances  for  the  Krige 
program.  For  the  other  programs  or  if  expanding  the  image  using  Expansion -Factor, 


I 


Old-Value 


y 


\C  on  fidenceJnterval ) 
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it  is  the  number  of  grid  points  in  the  x  direction  of  the  original  gridded  data.  If  ex¬ 
panding  the  image  only  grid  points  between  known  points  are  added  and  not  points 
beyond  the  edge  of  the  data  set.  The  kriging  program  has  a  default  value  of  one 
for  the  expansion  factor  and  calculates  the  number  of  expanded  grid  locations  as 
follows: 

Number jOf ^  Jncrements  =  {NumberjOf.XJncrements  —  1)  • 

Expansion  J' actor  -t- 1 

Number,Of.YJncrements=  -  Specifies  the  number  of  grid  points  in  the  y  di¬ 
rection  at  which  to  estimate  surface  values  and  estimation  variances  for  the  Krige 
program.  For  the  other  programs  or  if  expanding  the  image  using  Expansion  JFactor, 
it  is  the  number  of  grid  points  in  the  y  direction  of  the  original  gridded  data.  If  ex¬ 
panding  the  image  only  grid  points  between  known  points  are  added  and  not  points 
beyond  the  edge  of  the  data  set.  The  kriging  program  has  a  default  value  of  one 
for  the  expansion  factor  and  calculates  the  number  of  expanded  grid  locations  as 
follows: 

Number  JDf  Y  Jncrements  =  {Number.Of.Y  Jncrements  —  1)  • 

Expansion. Factor  -f  1 

Minimizing  -  Instructs  the  program  to  select  a  minimal  data  set.  The  initial 
data  set  is  based  on  an  “x”  pattern  using  the  semi-inclusion  distances  calculated 
from  the  range  or  set  by  the  user.  Only  aplies  to  the  Krige  program. 

Integer.Data  -  Informs  program  that  data  set  is  in  integer  format,  otherwise 
data  set  is  read  in  as  floats. 

TotaLNumbcr.Of.Points=  -  Specifies  the  total  number  of  points  in  the  data 
set.  Not  used  in  the  Krige  program. 


GriddtdJData  -  Informs  program  that  data  set  is  gridded  and  to  use  that  fact 
to  drastically  reduce  the  computational  time  required. 

Pattem.Output  -  Instructs  Krige  program  to  print  minimal  data  set  pattern 
using  an  “x”  for  an  included  point  and  a  for  an  excluded  point. 

Angles=  -  Specifies  the  angles,  in  degrees,  at  which  to  generate  semivariograms. 
Only  applies  to  the  Varfit  program. 

Maxlag=  -  Specifies  the  maximum  distance  to  use  as  termination  criteria  for 
semivariogram  generation.  Half  this  distance  is  then  used  in  the  determination  of 
the  theoretical  semivarioogram  models.  Only  applies  to  the  Varfit  program. 

IAX=  -  Specifies  the  semi-inclusion  distance  for  initial  point  inclusion  for  min¬ 
imal  data  set  generation  or  for  inclusion  in  estimation  for  normal  kriging  in  x  di¬ 
rection.  If  image  is  to  be  expanded  then  this  applies  to  the  original  grid  and  the 
program  multiplies  this  by  the  expansion  factor.  This  allows  the  input  to  be  based 
on  the  original  image  and  changing  the  amount  by  which  to  expand  the  image  is 
accomplished  by  changing  only  the  expansion  factor  input.  If  not  provided  then  this 
value  is  calculated  as  follows; 

f  _  N umber X>f.XJncrements  •  >1(0) 

M  mm 

{Xmax  —  Xmin)  •  v2 
Only  applies  to  the  Krige  program. 

IAY=  -  Specifies  the  semi-inclusion  distance  for  initial  point  inclusion  for  min¬ 
imal  data  set  generation  or  for  inclusion  in  estimation  for  normal  kriging  in  y  di¬ 
rection.  If  image  is  to  be  expanded  then  this  applies  to  the  original  grid  and  the 
program  multiplies  this  by  the  expansion  factor.  This  allows  the  input  to  be  based 
on  the  original  image  and  changing  the  amount  by  which  to  expand  the  image  is 
accomplished  by  changing  only  the  expansion  factor  input.  If  not  provided  then  this 
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value  is  calculated  as  follows: 


_  Number X>f-Y  Jncrements  •  A(0) 

{Y max  —  Ymin)  •  \/2 

Only  applies  to  the  Krige  program. 

Maximum.Points=  -  Specifies  the  maximum  number  of  points  to  allow  an 
estimate  to  be  based  upon.  Default  is  250  due  to  unreliability  of  matrix  inversion 
routine  for  values  larger  than  250.  Only  applies  to  Krige  program. 

Var.Angle=  -  Specifies  the  angle  at  which  to  generate  a  semivariogram.  Only 
applies  to  the  V  or  fit  program. 

Expansion-Factors  -  Specifies  the  amount  by  which  to  expand  the  image  in 
the  X  and  y  directions.  Increases  the  number  of  x  and  y  increments  and  the  semi¬ 
inclusion  distanc«s  as  indicated  under  the  sections  of  the  same  name.  Only  applies 
to  the  Krige  progran. 

Invert-Output  -  Instructs  program  to  output  points  by  columns  instead  of  by 
rows.  Only  applies  to  Krige  program. 

Number-Of-Header-Lines=  -  Informs  program  of  how  many  lines  to  skip,  as 
header  lines,  in  the  data  file  before  reading  data. 

Trend-Plus-Weightss  -  Instructs  program  on  what  type  of  local  drift  to  remove 
while  kriging.  Default  value  is  three.  If  illegal  value  provided  then  default  is  used. 
Legal  values  are: 

1  -  Do  not  remove  any  trend, 

3  -  Remove  linear  trend,  or 

6  -  Remove  quadratic  trsnd. 

Only  applies  to  Krige  program. 
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Use-LargestJ)ifference  -  Instructs  program  to  calculate  actual,  not  theoretical, 
estimation  variance  based  on  the  difference  between  the  estimate  and  the  actual 
value.  Only  applies  to  Krige  program  doing  minimization. 

Do-Shehang  -  Instructs  tb'^  program  to  select  minimal  data  set  by  minimizing 
the  entire  data  set  and  not  by  using  pattern  replication.  Much  slower  this  way.  Only 
applies  to  Krige  program  doing  minimization. 

PrinLlnterval=  -  Instructs  the  program  on  how  often  the  '^Status. Report"  file 
is  to  be  updated.  Can  save  considerable  time  since  program  may  be  input/output 
bound  doing  normal  kriging  or  minimizing  not  using  pattern  replication.  The  value 
of  the  input  deter.mines  how  many  points  are  estimated  before  the  status  file  is 
updated.  Only  applies  to  Krige  program. 
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A. 2  Example  Krige  Input  Files 
Values  different  for  each  run 

Maximum_Variance=  85.0 
Minimal.Dataset_Filename-kriged85.min 
Output _Filename*kriged85 . surf 
V2uriance_Filename=kriged85 . variance 
Error_Filename»kriged85 .out 

Common  values  for  all  runs  on  this  dataset 

A(0)=  102.994,  C(0)*  571.15,  C0(0)=  32.231 
A(90)»  95.117,  C(90)=  312.925,  C0(90)=  75.038 
Print.Interval*  25 

Common  values  for  all  runs  on  all  datasets 

Xmin=  0.0,  Xraeuc*  266 . 0 ,  Ymin*  0.0,  Ymeuc*  267 . 0 

Number_0f_X_Increments=  267,  Number_0f_Y_ Increments*  268 

Minimizing 

Pattern.Output 

Gridded.Data 

Integer.Data 

Data_Filename=mri .data 

End.Of .Input .Values 
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Values  different  for  each  run 


Largest.Dif f erence®  300 . 0 
Minimal_Dataset_Filenaine=kriged3.min 
Output  _Filenajne*kriged3 .  surf 
Variance_Filenaaie=kriged3  .variance 
Error_Filename=kriged3 . out 
IAX=  3,  IAY=  3 

Common  values  for  all  runs  on  this  dataset 

A(0)=  117.257,  C(0)=  814.124,  C0(0)=  0.0 
A(90)=  96.703,  C(90)=  328.165,  C0(90)=  62.293 


Common  values  for  all  runs  on  all  datasets 

Xrnin®  0.0,  Xmax=  266.0,  Ymin=  0.0,  Ymax=  267.0 

Number_0f_X_Increments=  267,  Number_0f_Y_Increment8=  268 

Minimizing 

Gridded.Data 

Integer.Data 

Data_Filename=mri .data 

End_0f .Input .Values 
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A. 3  Example  Varfit  Input  Files 

Xmin«  0.0,  Xmax=  266.0,  Ymin=  0.0,  Ymax=  267.0 

Nuinber_Of_X_Increments=  267,  Nuinber_Of_Y_Increments=  268 

Total_Nuniber_0f_Point8=  71556 

Gridded.Data 

Integer.Data 

Data_Filename=mri .data 

Output .Filename®. ./mri. repeat 

Variogram_Filename=mri . vario 

Plot_Filename®mri .plot 

Xstep®  1.0,  Ystep®  1.0,  Var.Step®  1.0 

MaxLag®  380 

Error_Filename=mri . out 

End.Of .Input .Values 
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Appendix  B.  Semivariogram  Plots 


This  appendix  includes  the  experimental  and  spherical  theoretical  Semivari¬ 
ogram  plots  for  the  0°  and  90°  Directions  for  each  slice  of  MRI  data  used  in  the 
kriging  procedures. 
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Figure  B.4.  Slice  3  Semivariogram  in  the  90°  Direction 
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Figure  B.o.  Slice  6  Semivaricgram  in  the  O”  Direction 
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Figure  B.7.  Slice  9  Semivariogram  in  the  0°  Direction 
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Figure  B.IO.  Slice  12  Semivi 
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Figure  B.12.  Slice  15  Semivariogram  in  the  90°  Direction 
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Figure  B.13.  Slice  18  Semiv< 
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Figure  B.14.  Slice  18  Semivariogram  in  the  90°  Direction 


Figure  B.17.  Slice  24  Semivariogram  in  the  0°  Direction 
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Figure  B.22.  Slice  24  Background  Partition  Semivariogram  in  the  90°  Direction 
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Appendix  C.  Photographs 


This  appendix  contains  the  photograghs  showing  the  results  of  the  minimiza¬ 
tion  process.  Each  photogragh  contains  a  series  of  images.  Each  series  of  images 
has  the  original  data  set  on  the  left,  the  kriged  image  in  the  center,  and  the  error 
between  the  original  and  kriged  image  on  the  right. 
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Figure  C.5.  Slice  3,  Semi-Inclusion  Distance  =  5,  4,  3 
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emi-Inclusion  Distance  =  5,  4,  3 
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Figure  C.19.  Slice  18,  Maximum  Variance  =  100,  85,  70 
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Figure  C.20.  Slice  18,  Semi-! 
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Distance  =  5,  4,  3 


Figure  C.23.  Slice  21,  Semi-Inclusion  Distance  =  5,  4,  3 
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Figure  C.24.  Slice  21,  Semi-Inclusion  Distance  =  3,  2,  ’ 


Figure  C.25.  Slice  24,  Maximum  Variance  =  100, 
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Figure  C.26.  Slice  24,  Semi-Inclusion  Distance  =  5,  4,  3 
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Appendix  D.  Minimized  Image  Tables 

Phis  appendix  includes  the  kriged  image  statistics  for  each  slice  of  MRI  data 
used  in  the  kriging  procedures.  The  run  times  were  for  all  jobs  running  concurrently. 
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Table  D.l.  Slice  1  Statistics 


Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Run  Time 
hr:min:s 

Photograph 

Maximum 

Allowed 

Variance 

100 

98.98 

98.70 

00:49:31 

1(a) 

85 

84.39 

96.53 

00:38:15 

MII11 

70 

68.37 

74.90 

00:54:45 

mmmM 

Initial 

Semi- 
Inclusion 
Distances 
lAX,  lAY 

5,  5 

91.56 

97.88 

00:07:26 

mmsm 

4,4 

85.72 

96.77 

00:05:20 

2(b) 

3,  3 

79.72 

94.34 

00:04:48 

2(c),  3(a) 

EH 

73.50 

87.36 

00:06:57 

mmsam 

1,1 

66.92 

50.00 

00:13:58 

3(c) 

Table  D.2.  Slice  3  Statistics 


Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Run  Time 
hr:min:s 

Photograph 

Maximum 

Allowed 

Variance 

100 

99.42 

99.39 

00:54:48 

4(a) 

85 

83.44 

98.91 

00:39:52 

wmsam 

70 

69.74 

97.80 

00:37:23 

BKn»i 

Initial 

Semi- 
Inclusion 
Distances 
lAX,  lAY 

5,  5 

70.27 

97.88 

00:03:40 

■i^asi 

ESI 

63.73 

96.77 

00:03:42 

5(b) 

3,  3 

57.09 

00:04:06 

5(c),  6(a) 

w 

50.24 

87.36 

00:04:36 

MMB 

1,1 

42.99 

50.00 

00:10:40 

6(c) 

Table  D.3,  Slice  6  Statistics 


Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Run  Time 
hr:min:s 

Photograph 

Maximum 

Allowed 

Variance 

100 

96.82 

99.38 

00:54:26 

7(a) 

85 

84.16 

98.94 

00:42:16 

7(b) 

70 

69.17 

97.10 

00:37:51 

7(c) 

Initial 

Semi- 
Inclusion 
Distances 
lAX,  lAY 

E[Q 

73.46 

97.88 

00:05:41 

8(a) 

4,4 

68.12 

96.77 

00:06:04 

8(b) 

3,  3 

62.64 

94.34 

00:08:27 

8(c),  9(a) 

2,2 

56.98 

87.36 

00:05:41 

9(b) 

1,  1 

50.97 

50.00 

00:16:25 

9(c) 
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Table  D.4.  Slice  9  Statistics 


Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Run  Time 
hr:min:s 

Photograph 

Maximum 

100 

98.69 

98.33 

00:45:04 

10(a) 

Allowed 

85 

82.09 

93.63 

00:39:55 

10(b) 

Variance 

70 

69.18 

25.09 

01:34:58 

10(c) 

95.44 

97.88 

00:03:33 

11(a) 

89.79 

96.77 

00:03:46 

11(b) 

Inclusion 

83.99 

94.34 

00:04:42 

11(c), 12(a) 

Distances 

wi 

77.97 

87.36 

00:05:51 

12(b) 

lAX,  lAY 

1,1 

71.61 

50.00 

00:19:48 

12(c) 

Table  D.5.  Slice  12  Statistics 


Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Run  Time 
hr:min:s 

Photograph 

Maximum 

100 

97.13 

99.60 

00:55:45 

13(a} 

Allowed 

85 

84.39 

99.32 

00:42:42 

I3(b) 

Variance 

70 

69.52 

98.67 

00:38:55 

13(c) 

Initial 

5,  5 

62.49 

97.88 

00:03:57 

14(a) 

Semi- 

4,4 

57.16 

96.77 

00:04:18 

14(b) 

Inclusion 

3,  3 

51.72 

94.34 

00:04:41 

14(c),15(a) 

Distances 

2,  2 

46.10 

87.36 

00:06:01 

15(b) 

lAX,  lAY 

1,1 

40.13 

50.00 

00:19:41 

Table  D.6 

.  Slice  15  Statistics 

Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Run  Time 
hr:min:s 

Photograph 

Maximum 

100 

98.92 

99.84 

01:17:16 

16(a) 

Allowed 

85 

83.58 

99.78 

00:58:32 

16(b) 

Variance 

70 

70.00 

99.70 

00:30:11 

16(c) 

Initial 

5,5 

26.64 

97.88 

00:04:07 

17(a) 

Semi- 

ESI 

21.90 

96.77 

00:05:07 

17(b) 

Inclusion 

3,  3 

17.16 

94.34 

00:05:42 

17(c),18(a) 

Distances 

2,  2 

12.42 

87.36 

00:09:26 

18(b) 

lAX,  lAY 

1,1 

7.65 

00:34:58 

18(c) 
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Table  D.7.  Slice  18  Statistics 


Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Run  Time 
hr:min:s 

Photograph 

Maximum 

Allowed 

Variance 

.00 

98.20 

99.91 

02:08:07 

19(a) 

85 

84.64 

99.90 

01:05:45 

19(b) 

70 

69.81 

99.84 

00:34:25 

Initial 

Semi- 
Inclusion 
Distances 
lAX,  lAY 

17.12 

97.88 

00:03:49 

20(a) 

ESI 

13.69 

96.77 

00:04:09 

20(b) 

ESI 

10.27 

94.34 

00:04:38 

20(c), 21(a) 

EH 

6.84 

87.36 

00:06:04 

S(b5 

1,1 

0.42 

50.00 

00:15:32 

21(c) 

Table  D.8.  Slice  21  Statistics 


Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Run  Time 
hr;min:s 

Photograph 

Maximum 

Allowed 

Variance 

100 

99.92 

99.96 

01:36:19 

22(a) 

85 

84.67 

99.95 

01:39:50 

22(b) 

70 

69.45 

99.93 

01:27:51 

22(c) 

Initial 

Semi- 
Inclusion 
Distances 
lAX,  lAY 

5,  5 

9.17 

97.88 

00:04:07 

23(a) 

4,4 

7.34 

96.77 

00:04:48 

23(b) 

3,  3 

5.50 

94.34 

00:05:40 

23(c),24(a) 

EH 

3.67 

87.36 

00:09:25 

1,1 

1.83 

50.00 

00:34:57 

24(c) 

Table  D.9.  Slice  24  Statistics 


Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Run  Time 
hr:min:s 

Photograph 

Maximum 

Allowed 

Variance 

100 

53.15 

99.96 

01:42:16 

25(a) 

85 

53.15 

99.96 

02:02:15 

25(b) 

70 

53.15 

99.96 

01:41:20 

25(c) 

Initial 

Semi- 
Inclusion 
Distances 
lAX,  lAY 

5,  5 

3.94 

97.88 

00:04:06 

26(a) 

ESI 

3.15 

96.77 

00:04:25 

26(b) 

3,  3 

2.36 

94.34 

00:05:18 

26(c), 27(a) 

2,  2 

1.57 

87.36 

00:07:36 

27(b) 

1,  1 

0.79 

50.00 

00:20:29 

27(c) 
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Table  D.IO.  Slice  24,  Partitioned,  Statistics 


Minimization 

Method 

Largest 

Variance 

Percent 

Reduction 

Run  Time 
br;min;s 

Photograph 

Maximum 

Allowed 

Variance 

13 

12.45 

96.65 

00:08:00 

28(a) 

12 

11.83 

95.73 

00:08:23 

28(b) 

11 

10.77 

91.04 

00:10:58 

28(c) 
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Appendix  E.  Programmers  Manual 


This  appendix  provides  a  description  of  the  implementation  of  the  routines 
needed  for  each  program  with  explanations  of  how  and  why  the  coding  is  structured 
as  presented. 
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E.l  Matrix  Object 

To  minimize  a  data  set,  the  kriging  program  requires  a  matrix  that  can  expand 
and  contract  as  points  axe  added  to  and  deleted  from  the  minimal  data  set.  Typical 
matrix  operations  must  also  be  supported.  This  can  not  be  done  using  typical 
programming  techniques.  What  is  needed  is  an  object  oriented  language.  The  matrix 
object  is,  therefore,  written  in  C++  and  supports  the  following  requirements: 

•  Create  any  size  matrix, 

•  Delete  a  matrix  and  free  memory, 

•  Set  an  element  of  the  matrix  to  a  desired  value, 

•  Get  the  value  of  an  element  of  the  matrix, 

•  Add  a  row  at  the  bottom  of  the  matrix, 

•  Add  a  column  at  the  right  edge  of  the  matrix, 

•  Delete  a  row  from  the  bottom  of  the  matrix, 

•  Delete  a  column  from  the  right  edge  of  the  matrix, 

•  Transpose  the  matrix, 

•  Invert  the  matrix, 

•  Reinvert  the  matrix  after  adding  rows  and  columns, 

•  Premultiply  the  matrix  by  another  matrix, 

•  Postmultiply  the  matrix  by  another  matrix, 

•  Multiply  the  matrix  by  a  scalar  value, 

•  Add  another  matrix  to  this  matrix, 

•  Copy  one  matrix  to  another, 

•  Return  the  number  of  rows  of  the  matrix, 

•  Return  the  number  of  columns  of  the  matrix. 
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•  Test  if  last  operation  was  successful. 


Implementation  of  most  of  these  routines  is  straight  forward  and  will  not  be 
discussed.  Two  additional  routines,  LUDCMP  (renauned  LU .Decomposition)  and 
LUBKSB  (renamed  LU .Backsubstitute),  were  adapted  from  Numerical  Recipes 
in  C  (16).  LU JDecomposition  decomposes  the  matrix  into  the  product  of  two 
matrices,  [L]  and  [U],  where  [L]  is  lower  triangular  and  [U]  is  upper  triangular. 
LU .Backsubstitute  performs  back  substitution  on  the  decomposed  matrix.  The 
combination  of  these  programs  provided  an  efficient  method  for  inverting  the  ma¬ 
trix.  Of  primary  interest  is  the  implementation  of  the  object  such  that  it  has  the 
ability  to  expand  and  contract. 

The  matrix  object  is  implemented  as  a  structure  -'ontaining  information  about 
the  matrix,  such  as  number  of  rows  and  number  of  columns,  zmd  a  pointer  to  an 
array  of  arrays  in  which  the  values  are  stored.  The  structure  is  illustrated  in  Fig¬ 
ure  E.l  which  was  adapted  from  the  matrix  object  source  code  file.  Each  individual 
array,  of  the  array  of  arrays,  is  sized  as  the  minimum  of  100  or  the  number  of  rows 
requested  by  the  user  for  the  row  size  of  the  array,  and  as  the  minimum  of  100  or 
the  number  of  columns  requested  by  the  user  for  the  column  size  of  the  array.  This 
implementation  represents  a  compromise  between  excessive  unused,  but  allocated, 
memory  cind  excessive  requests  for  memory  with  more  overhead  per  value  stored.  By 
allocating  large  amounts  of  memory  per  array,  the  number  of  requests  for  memory 
and  the  administrative  overhead  keeping  track  of  those  allocations  is  minimized.  But 
the  amount  of  unused,  but  allocated,  memory  becomes  excessive  and,  if  carried  to 
the  extreme,  can  require  more  memory  than  is  available.  By  allocating  only  enough 
memory  for  each  row  or  column,  unused  memory  is  minimized,  but,  the  number 
of  requests  for  memory  and  the  administrative  overhead  keeping  track  of  all  those 
allocations  becomes  excessive.  Using  the  structure  previously  outlined,  rows  and 
columns  can  be  added  by  allocating  more  arrays  of  memory.  The  number  of  rows 
and  columns  can  be  reduced  very  easily  by  adjusting  the  matrix  information  kept  in 


Figure  E.l.  Matrix  Object  Structure 

the  matrix  structure.  This  does  not  return  memory  to  the  system  but  does  appear, 
to  the  user,  as  if  the  rows  and  columns  are  deleted.  Of  several  structures  investigated 
this  one  appeared  to  be  the  closest  to  optimal.  The  matrix  object  is  also  used  in  the 
partitioning,  residuals  removal,  and  semivariogram  estimating  programs. 

Of  secondary  interest  is  the  implementation  of  the  ReInvert  routine.  This  rou¬ 
tine  uses  Szidarovszky’s  matrix  partitioning  method  to  reinvert  the  matrix  (19:192). 
Given  the  two  matrices,  [A]  and  [B],  partitioned  into  submatrices  as  follows: 


A  = 


P  Q 
R  S 


B  = 


Y  Z 
U  V 
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where  [A]  •  [B]  =  [/]  and  P~^  is  known.  Performing  the  multiplication  can  be  seen 
that: 


p.Y  +  QU  =  I 

(E.1) 

P-Z  +  Q-V  =  0 

(E.2) 

R‘Y  +  S-U  =  0 

(E.3) 

R‘Z  +  S‘V  =  I 

(E.4! 

solving  the  second  equation  for  Z: 

Z  =  -P-' .  Q  •  V 

combining  this  with  the  fourth  equation  and  solving  for  V: 

V  =  •(?)-* 

solving  the  first  equation  for  Y : 

y  =  •  (/  -  Q  •  t/)  =  p-*  -  (p-* .  Q)  •  t/ 

combining  this  with  the  third  equation  yields: 

U  =  -{S-RP-'’Q)-'RP-'  =  -y.p.p-i 
therefore,  the  following  relationships  are  established: 

V  =  {S-R-P-^ -Q)-^ 

Z  =  -P~^.Q.V 

U  =  -V-RP-^ 

Y  =  p-' -  (p-> .  g) .  (/ 
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If  the  number  of  added  rows  and  columns  is  small  relative  to  the  original  matrix  size, 
the  computational  time  saved  by  not  inverting  the  entire  matrix  will  be  substantial. 

E.2  Interface  Routines 

In  support  of  the  kriging  procedure’s  need  for  a  common  interface  that  isolates 
the  input  data  format  and  control  information  format  from  the  kriging  and  structural 
anzJysis  procedures,  an  interface  routines  package,  'vritten  in  C++,  is  provided.  The 
following  are  the  routines  provided  for  that  standcirdized  interf<ice. 

Input-Check.  The  number  of  command  line  arguments  is  tested  to  determine 
if  a  control  information  file  name  is  provided  and  print  usage  information  if  not. 

Get-Data-Point.  Reads  the  data  from  the  disk  and  assigns  each  point  to  a 
grid  location.  Returns  row  and  column  indices  of  the  grid  location  as  well  as  the 
coordinates  of  the  point  and  the  quantity  of  interest. 

PutJ)ata.Point.  Writes  the  data  to  the  output  file  in  the  desired  format.  Row 
and  column  indices  of  the  grid  location  as  well  as  the  coordinates  of  the  point  and 
the  quantity  of  interest  are  sent  as  parameters. 

Output-Point.  Writes  the  data  to  the  minimal  data  file  in  the  desired  format. 
Row  and  column  indices  of  the  grid  location  as  well  as  the  coordinates  of  the  point 
and  the  quantity  of  interest  are  sent  as  parameters. 

Put-Variance.  Writes  the  estimation  variance  to  the  variance  file  in  the  desired 
format.  Row  and  column  indices  of  the  grid  location  as  well  as  the  coordinates  of 
the  point  and  the  variance  are  sent  as  parameters. 

Get-ControLInput.  This  routine  gets  the  program  control  information  and 
opens  the  data  files.  The  following  control  parameters  cam  be  set  by  the  user: 

•  Number  of  increments  on  the  x  and  y  axes, 

•  Maximum  values  of  x  and  y, 
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•  Minimum  values  of  x  and  y, 

•  Step  size  for  x  and  y, 

•  Zone  of  influence  (range),  a,  in  0°  and  90*^  directions, 

•  Sill  minus  the  nugget  effect,  C,  in  0^^  and  90°  directions, 

•  Nugget  effect,  CO,  in  0  and  90  degree  directions, 

•  Angles  at  which  to  generate  semivariograms, 

•  Correlations  generated  by  semivariogram  program  for  each  theoretical  model, 

•  The  semi-inclusion  distance  for  estimation  or  minimizing, 

•  Maximum  allowable  variance,  used  for  data  minimization  , 

•  Confidence  interval  co  use  calculating  maximum  allowable  variance, 

•  Largest  allowable  difference  of  estimate  from  actual  value, 

•  Minimum  distance  between  points  to  consider  them  distinct, 

•  Minimization  flag, 

•  Integer  data  flag, 

•  Input  data  file  name, 

•  Output  file  name, 

•  Variance  file  name, 

•  Minimal  data  set  file  name, 

•  Plot  file  name, 

•  Semivariogram  file  name, 

•  Gridded  data  flag, 

•  Pattern  output  flag, 

•  Maximum  number  of  points  allowed  in  [A]  matrix, 

•  Polynomial  coefficients. 
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•  Semivariogram  inclusion  angle  width, 

•  Semivariogram  step, 

•  Total  numbe'  of  data  points, 

•  Maximum  lag  for  semivariogram  calculation, 

•  Image  expansion  factor, 

•  Output  inversion  flag,  and 

•  Number  of  header  lines  in  data  file. 

The  parameter  input  names,  usage,  and  the  actions  taken  based  on  those  pa¬ 
rameters  are  covered  in  detail  in  Appendix  A,  The  User’s  Manual. 

E.S  Kriging  Routines 

To  perform  the  tasks  required  by  this  thesis  effort  and  the  two  concurrent 
thesis  efforts,  a  kriging  program  originally  written  in  C  by  Michael  Grant  (9)  was 
rewritten  in  C-f-1-  with  extensive  modifications.  The  modifications  correct  some 
programmatic  errors  as  well  as  incorporate  geometric  anisotropy,  minimal  data  set 
production,  and  generalization  to  any  data  set.  The  program  is  long  and  somewhat 
complex  but  is  composed  of  short  and  easy  to  understand  routines.  Therefore,  to 
enhance  understandability,  each  routine  developed  will  be  presented  separately.  Only 
those  routines  that  directly  support  kriging  will  be  described  in  this  section.  Those 
routines  that  support  minimization  will  be  presented  in  the  minimization  section. 
Those  routines  that  have  elements  of  both  kriging  and  minimization  will  be  presented 
in  both  sections.  Note  that  a  point  is  “kriged”  if  the  estimate  and  the  estimation 
variance  at  that  point  are  calculated. 

Executive  Routine.  The  primary  use  of  the  kriging  equations  is  the  calculation 
of  the  estimate  and  the  estimation  variance  at  a  point.  This  involves  solving  for  the 
weights,  {A"},  from  the  kriging  system  of  equations  [A]  •  {A}  =  {5}.  To  build  the 
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matrices,  the  known  points  in  the  vicinity  of  the  point  being  kriged  must  be  found. 
Once  the  weights  are  known  the  point  can  be  kriged.  Therefore,  the  heart  of  the 
kriging  program  is  the  following  routine  calls: 

•  Get_Pts(i,  j) 

•  BuildJ^O 

•  Build_B(i,  j) 

•  BuildJCO 

•  Estimate(i,  j,  Minimization-Flag) 

For  this  effort,  the  points  to  be  estimated  lie  on  a  regularly  spaced  grid.  This 
allows  the  coordinates  of  a  point  to  be  specified  by  the  indices,  i  and  j,  of  a  two 
dimensional  array.  The  routines  that  need  to  know  the  coordinates  of  the  point  being 
kriged  are  passed  the  array  indices  from  which  the  coordinates  can  then  be  calculated. 
This  sequence  needs  some  modification,  however,  for  this  application.  First,  a  point 
without  known  points  in  the  neighborhood  on  which  to  base  an  estimate  or  a  point 
that  is  all  ready  known  can  not  be  kriged.  Therefore,  Get-Pts  will  return  zero  if  the 
point  is  known  or  if  there  are  no  points  in  the  neighborhood.  If  minimization  is  being 
performed,  the  presence  of  a  point  at  the  grid  location  can  be  determined  directly. 
The  variable  Unknown^Point  is  set  to  true  if  the  point  is  to  be  kriged.  The  other 
routine  calls  are  then  placed  within  a  block  i  f  statement  testing  on  Unknown. Point. 
A  second  modification  involves  the  routine  to  build  the  [A]  matrix.  This  routine  is 
called  only  if  minimization  is  NOT  being  performed.  This  will  be  discussed  in  the 
mininaization  section. 

The  kriging  routines  need  to  be  exercised  for  all  the  grid  locations.  This  is 
accomplished  with  two  nested  for  loops.  The  indices  of  the  for  loops  are  variables 
so  that  subsections  of  the  entire  grid  may  be  kriged.  This  will  be  very  useful  for 
minimization  as  will  be  explained  later.  Minimization  also  requires  accomplishing 
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the  doubly  nested  for  loops  after  a  known  data  point  is  added  to  or  removed  from 
the  minimal  data  set.  Therefore,  these  two  loops  need  to  be  nested  within  a  construct 
that  is  executed  at  least  once  for  ordinary,  non-minimizing,  kriging.  The  construct 
used  is  the  do  while  loop. 

Before  any  kriging  can  be  accomplished  some  program  set-up  is  required.  First, 
program  control  information  must  be  acquired  using  the  GetJControlJnput  routine. 
Second,  the  data  must  be  read  in  from  the  disk  using  the  DataJn  routine.  Lastly, 
data  structures  must  be  allocated  and  variables  initialized.  Initialized alues  sets 
variables  used  for  minimization. 

After  the  data  is  kriged  it  is  written  to  the  disk  with  the  Output  routine.  The 
pattern  of  known  points  may  also  be  printed.  Within  the  nested  loops  is  a  print 
section  to  write  to  a  file  named  “‘Status. Report  the  current  location  in  the  data 
field  and  some  other  values  of  interest. 

DataJn.  The  construct  used  to  hold  the  data  points  is  an  array  of  pointers  to  a 
structure  that  contains  the  coordinates  of  the  point  and  the  value  of  interest  as  well 
as  a  pointer  to  the  next  point  collocated  at  this  grid  location.  This  is,  in  essence,  an 
array  of  bins  in  which  to  place  the  data.  This  construct  is  used  since  many  points 
from  an  irregularly  spaced  data  field  may  be  assigned  to  the  same  grid  location.  For 
minimization  two  of  these  constructs  are  used,  one  to  contain  all  the  data  and  one 
to  contain  the  minimal  data  set. 

To  get  the  data  from  disk  a  while  loop  is  used.  The  data  is  read  from  the 
disk  and  assigned  to  a  grid  location  by  the  interface  routine  GetJDataJPoint.  When 
all  the  data  is  read  in  GetJDatadoint  returns  the  End  Of  File  (EOF)  value  and 
terminates  the  while  loop.  A  small  amount  of  memory  is  allocated  for  each  datum 
read  in  and  kept.  Not  all  data  will  be  kept.  Duplicate  data  points  are  removed 
because  they  create  an  equation,  in  the  kriging  system  of  equations,  that  is  not 
linearly  independent  of  the  others.  If  a  matrix  contains  some  equations  that  are 
linear  combinations  of  others  in  the  matrix,  the  matrix  is  singular  and  can  not  be 
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inverted.  If  the  matrix  can’t  be  inverted  the  weights  can’t  be  calculated  and  the 
point  can’t  be  estimated. 

GeLPts.  This  routine  is  in  two  major  sections:  get  the  points  at  the  grid 
location  being  kriged  and  get  the  points  within  a  square  neighborhood  centered 
around  the  point  being  kriged.  The  square  must  be  as  large  as  possible  subject  to 
the  following  two  constraints:  1)  the  total  number  of  points  must  not  exceed  the 
number  at  which  the  matrix  inversion  routine  becomes  unreliable;  and,  2)  the  size 
of  the  square  must  not  exceed  the  size  set  by  the  user.  To  accomplish  this,  points 
are  added  from  a  square  that  is  only  one  row  and  column  larger  above,  below,  to  the 
left,  and  to  the  right  than  the  last  square.  This  is  repeated  until  the  point  limit  or 
square  size  limit  is  reached.  This  then  reduces  to  adding  the  points  from  the  edges 
of  an  “expanding”  square,  centered  around  the  point  being  kriged,  that  starts  with 
three  rows  and  three  columns  and  ends  when  one  of  the  limit  conditions  is  reached. 

The  first  section  is  a  while  loop  that  gets  all  the  points  at  the  current  grid 
location.  Within  the  loop  is  a  test  to  determine  if  one  of  the  points  coincides  with 
the  grid  location.  If  so,  that  point  is  used  and  kriging  is  not  performed.  The  routine 
returns  zero  in  this  Ccise. 

The  second  section  is  an  outer  for  loop  that  increments  the  distance  from  the 
central  point  to  the  edges  of  the  expanding  square.  Within  this  loop  are  four  loops 
to  get  the  points  from  each  edge  of  the  square.  These  four  loops  are  very  similar  with 
only  minor  differences  to  ensure  that  the  same  point  is  not  sampled  twice.  Therefore, 
only  one  of  the  loops  will  be  described.  Each  loop  starts  by  calculating  the  range 
for  the  indices  of  the  for  loop.  The  range  is  tested  to  make  sure  it  does  not  fall 
outside  the  grid.  The  range  is  also  tested  to  make  sure  that  the  points  on  that  edge 
have  not  all  ready  been  sampled.  A  loop  almost  identical  to  that  in  the  first  section 
is  then  performed  on  each  location  on  that  edge.  The  differences  between  this  loop 
and  the  first  are:  points  are  not  tested  to  determine  if  they  coincide  with  the  point 
being  kriged  and  the  distance  from  the  point  being  kriged  is  calculated  so  that  the 
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point  will  be  included  only  if  it  is  within  the  theoretical  semivariogram  range. 

The  points  are  placed  in  an  expanding  matrix.  This  is  done  so  that  the  number 
of  points  to  use  for  kriging  can  be  set  by  setting  the  variable  Kpts.  The  number 
of  points  added  to  the  matrix  upon  completion  of  the  loop  to  add  points  from  the 
last  edge  of  the  previous  square  is  saved  in  the  variable  LastJKpts.  If  the  number  of 
points  added  to  the  matrix  from  the  current  edge  of  the  expanding  square  exceeds 
the  maximum  set  by  the  user  then  the  number  of  points  to  use  for  kriging  is  set 
to  LastJKpts.  There  is  a  maximum  of  approximately  250  points  that  the  current 
matrix  inversion  routine  can  reliably  handle. 

Build. A.  This  routine  is  in  five  sections  and  is  relatively  easy  to  follow.  The 
first  section  resizes  [A]  to  the  necessary  dimensions.  The  second  section  uses  a 
doubly  nested  for  loop  to  cycle  through  the  sample  points  matrix  and  determine 
the  covariance  between  each  pair  of  known  points  and  then  put  that  value  into  the  [A] 
matrix.  The  Sigma  routine  calculates  the  covariance  bcised  on  the  distance  between 
the  points.  The  third  section  adds  a  row  and  column  of  ones  so  that  the  weights  will 
sum  to  one.  The  fourth  section  adds  the  terms  to  remove  the  local  drift.  The  drift 
is  cissumed  linear.  The  terms  are  calculated  based  upon  a  relative  origin  located  at 
the  upper  left  corner  of  the  area  being  kriged.  This  is  necessary  for  minimization 
and  will  be  explained  in  the  minimization  section.  The  fifth,  and  last,  section  sets 
the  block  of  zeros  required  by  the  matrix  form  of  the  kriging  system  of  equations. 
After  the  matrix  is  constructed  the  inverse  is  calculated. 

Build.B.  This  routine  is  in  three  sections  and  is  also  easy  to  follow.  The  first 
section  resizes  {B}  to  the  necessary  dimensions.  The  second  section  uses  a  /or  loop 
to  cycle  through  the  sample  points  matrix  and  determine  the  covariance  between 
each  known  point  and  the  point  to  be  kriged  and  then  put  that  value  into  the  {B} 
matrix.  The  Sigma  routine  calculates  the  covariance  based  on  the  distance  between 
the  points.  The  third  section  puts  the  one  for  the  sum  of  the  weights  and  the  values 
for  the  local  drift  into  the  matrix.  The  values  for  the  local  drift  are  calculated  based 
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on  the  relative  origin  described  in  the  previous  section. 


Build-X.  The  {B}  matrix  is  copied  into  the  {A"}  matrix  and  the  {A}  matrix 
is  then  premultiplied  by  the  [A]~^  matrix.  There  is  more  coding  to  minimize  com¬ 
putational  time  if  minimization  is  being  performed  and  will  be  discussed  under  the 
minimization  section. 

Estimate.  The  estimate  and  estimation  variance  is  calculated  using  the  equa¬ 
tions  presented  in  the  literature  review  and  stored  into  the  surface  and  variance 
matrices.  If  the  semivariogram  is  used  in  plaee  of  the  covariance  in  the  Sigma  rou¬ 
tine  the  formula  for  the  estimation  variance  must  be  changed  to  the  semivariogram 
form. 

Output.  The  surface  estimates,  estimation  variances,  and,  if  produced,  the 
minimal  data  set  are  written  to  the  disk  using  the  appropriate  interface  routine. 
Doubly  nested  for  loops  are  used  to  cycle  through  the  matrices.  There  is  some 
extra  coding  to  reverse  the  output  format  of  the  data  if  the  user  so  desires. 

Sigma.  The  covariance  for  the  spherical  theoretical  semivariogram  model  is 
calculated  using  the  equation  presented  in  the  literature  review  that  corrects  for 
geometric  anisotropy.  An  average  sill  is  calculated  in  the  executive  routine  and 
used  in  this  routine.  Zonal  anisotropy  is  beyond  the  scope  of  this  effort  and  is 
not  accounted  for  in  the  covariance  calculation.  This  can  be  changed  to  return  the 
semivariogram  without  impacting  any  routine  other  than  the  Estimate  routine. 

minO.  This  routine  returns  the  smaller  of  the  two  integer  parameters. 

The  following  routines  are  added  to  support  minimization. 

AddSoints.  This  routine  adds  points  based  on  the  semi-inclusion  distances 
and  fills  the  sample  matrix  with  the  data  that  is  within  the  current  kriging  pattern. 
The  surface  value  at  each  added  point  location  is  set  to  the  value  of  the  quantity  of 
interest  of  that  point  and  the  estimation  variance  at  that  point  is  set  to  zero.  All 
this  is  accomplished  very  simply  with  doubly  nested  for  loops. 
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Initialize.  Values.  This  routine  adds  the  initial  points  to  the  minimal  data  set 
based  on  the  semi-inclusion  distances.  If  the  user  does  not  set  the  semi-inclusion 
distances  they  will  be  calculated,  in  the  executive  routine,  based  on  the  theoretical 
semivariogram  range.  The  calculation  is  based  on  a  square  kriging  area  with  the 
distance  between  the  center  point  and  a  corner  point  being  just  less  than  the  theo¬ 
retical  semivariogram  range.  The  user  may  input  the  number  of  times  to  divide  the 
semi-inclusion  distance  by  two.  This  is  useful  if  the  semivariogram  range  is  large 
and  the  maximum  variance  is  small  because  it  reduces  the  size  of  the  kriging  area  to 
start  with.  This  routine  also  determines  if  the  entire  matrix  is  to  be  minimized  or  if 
submatrices  are  to  be  minimized  and  sets  the  minimization  flag  appropriately. 

Test.Set.And.Add.  This  routine  is  long  but  is  in  ten  distinct  and  fairly  indepen¬ 
dent  sections.  The  flow  of  control  is  from  top  to  bottom  and  the  interaction  is  limited 
to  consecutive  sections.  Which  section  is  currently  being  executed  is  controlled  by 
the  variable  Minimization  J'lag  and  indicates  which  submatrix,  or  kriging  area,  is 
currently  being  minimized  or  kriged.  MinimizatimJ'lag  takes  on  the  following 
values  and  meanings: 

-1  -  minimizing  entire  matrix  due  to  large  ran^  . 

0  -  minimizing  square  central  submatrix 

1  -  kriging  central  submatrices 

2  -  kriging  left  edge  submatrices 

3  -  kriging  top  edge  submatrices 

4  -  kriging  right  edge  submatrices 

5  -  kriging  bottom  edge  submatrices 

6  -  kriging  upper  left  corner  submatrix 

7  -  kriging  upper  right  comer  submatrix 

8  -  kriging  lower  left  comer  submatrix 

9  -  kriging  lower  right  corner  submatrix 
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Since  a  matrix  is  used  to  represent  most  of  required  data  constructs  the  phraseology 
is  coached  in  terms  of  matrices.  These  ten  sections  fall  under  two  major  tasks:  add 
points  to  reduce  the  largest  estimation  variance  and  krige  a  subarea  of  the  grid  with 
the  minimal  data  set.  Therefore,  these  two  major  tasks  will  be  addressed  instead  of 
the  ten  sections. 

As  each  submatrix  is  kriged  the  location  eind  value  of  the  largest  estimation 
variance  is  recorded.  If  the  submatrix  equal  to  the  entire  matrix  is  being  minimized 
and  the  largest  variance  is  greater  than  the  maximum  allowed  variance  a  point  is 
added  at  the  recorded  location  and  the  matrix  is  rekriged.  If  a  submatrix  not  equal 
to  the  entire  matrix  is  being  minimized  and  a  point  must  be  added  then  symetric 
points  about  an  x  and  y  axis  through  the  center  point  of  the  kriging  pattern  must 
also  be  added  as  previously  explained.  If  the  kriging  pattern  is  replicated  and  the 
semi-inclusion  distance  is  divisible  by  two  then  the  size  of  the  kriging  area  will  be 
reduced.  If  the  maximum  variance  criteria  is  met  then  set-up  for  the  next  section 
must  be  performed.  If  the  entire  matrix  is  being  minimized  then  minimization  is 
complete  as  the  whole  array  has  all  ready  been  kriged.  If  a  submatrix  is  being 
minimized  then  the  first  kriging  section  must  be  initialized. 

Each  kriging  section,  except  for  the  corner  sections,  must  increment  the  row 
and  column  submatrix  indices  to  the  next  submatrix  to  krige.  If  the  first  subma¬ 
trix  of  the  section  is  being  kriged  then  the  Build-X  routine  must  be  instructed  to 
save  the  weights,  otherwise  Build  Ji.  must  be  instructed  to  use  the  previously  saved 
weights.  Every  section  must  restock  the  sample  array  with  the  points  from  the  cur¬ 
rent  submatrix  being  kriged.  Each  section  must  also  determine  if  all  the  submatrices 
in  that  section  have  been  kriged  and,  if  so,  set-up  the  next  section.  The  section 
kriging  the  central  submatrices  must  also  determine  when  the  last  submatrix  of  the 
current  submatrix  row  is  completed  so  that  the  column  indices  can  be  set  to  the  first 
submatrix  of  the  row  and  the  row  indices  can  be  incremented  to  the  next  submatrix 
row.  The  coding  of  this  routine  should  now  be  relatively  easy  to  understand. 


E-15 


When  minimization  is  finished  the  minimization  flag  is  set  to  false  to  allow  the 
do  while  loop  in  the  executive  routine  to  stop  looping.  The  minimized  flag  is  set  to 
true  so  that  the  Output  routine  will  write  the  minimal  data  set  to  the  disk. 

ReBuild.A.  This  routine  adds  the  extra  terms  to  the  [A]  matrix  required  by 
the  addition  of  extra  points  to  the  kriging  area.  It  is  structured  very  similarly  to 
Builds.  The  difference  is  the  starting  indices  for  the  loops.  Build-A  fills  the  entire 
matrix  whereas  this  routine  adds  to  the  matrix  on  the  right  and  at  the  bottom.  This 
was  done  so  that  the  ReInvert  method  of  the  matrix  object  could  be  used  to  save 
some  computational  time. 

Build. A.  For  [A]“^  of  the  first  kriged  pattern  to  equal  [A]~^  of  subsequently 
kriged  identically  patterned  areas  all  terms  in  the  [A]  matrix  must  be  identical. 
This  includes  the  values  used  to  account  for  the  local  drift.  For  these  values  to  be 
identical  from  pattern  to  pattern  they  must  have  the  same  relative  origin.  Therefore, 
the  upper  left  corner  of  each  kriging  area  is  used  as  the  relative  origin  for  that  area. 
This  origin  is  also  used  in  ReBuild.A  and  Build.B. 

Build.X.  There  are  two  sections  of  code  added  for  minimization.  The  first 
section  uses  the  previously  saved  weights  if  the  flag  to  use  the  saved  weights  is  set 
otherwise  it  calculates  the  weights.  The  second  section  saves  the  weights  if  the  flag 
to  save  the  weights  is  set. 

Calculate.IAXJA  Y.  This  routine  is  implemented  as  a  single  do  while  loop  that 
terminates  when  the  estimation  variance  for  the  center  point  of  a  rectangular  kriging 
pattern  is  smaller  than  the  maximum  allowed  variance.  There  is  an  if  structure 
to  determine  the  correct  number  and  coordinates  of  the  kriging  pattern  based  on 
whether  the  kriging  pattern  is  to  be  used  or  if  normal  kriging  is  to  be  used.  To 
reduce  the  distance  between  the  points  for  each  succesive  paiss  through  the  loop  a 
multiplier  is  used  that  is  decremented  on  each  pass  through  the  loop. 
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